* [PATCH v6 01/42] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-07 18:17 ` [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
` (42 subsequent siblings)
43 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
commit 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by
searching closid_num_dirty_rmid") added logic that causes resctrl to
search for the CLOSID with the fewest dirty cache lines when creating a
new control group, if requested by the arch code. This depends on the
values read from the llc_occupancy counters. The logic is applicable to
architectures where the CLOSID effectively forms part of the monitoring
identifier and so do not allow complete freedom to choose an unused
monitoring identifier for a given CLOSID.
This support missed that some platforms may not have these counters.
This causes a NULL pointer dereference when creating a new control
group as the array was not allocated by dom_data_init().
As this feature isn't necessary on platforms that don't have cache
occupancy monitors, add this to the check that occurs when a new
control group is allocated.
Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
The existing code is not selected by any upstream platform, so it makes
no sense to backport this patch to stable.
Changes since v1:
* [Commit message only] Reword the first paragraph to make it clear
that the issue being fixed wasn't directly associated with addition
of a Kconfig option. (Actually, the option is not in Kconfig yet,
and gets added later in this series.)
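For readers who want to see the failure mode in isolation, here is a
minimal userspace sketch of the guard this patch adds. Everything
prefixed sketch_/SKETCH_ is a hypothetical stand-in for the kernel
state, not the real resctrl code: the two booleans model
IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) and
is_llc_occupancy_enabled(), and the pointer models the
closid_num_dirty_rmid array, which stays NULL on platforms without
occupancy counters.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_CLOSID 4	/* hypothetical; the real count comes from hardware */

/* Hypothetical userspace stand-in for the kernel state involved. */
struct sketch_state {
	bool rmid_depends_on_closid;	/* models the Kconfig option */
	bool llc_occupancy_enabled;	/* models is_llc_occupancy_enabled() */
	const int *num_dirty_rmid;	/* models closid_num_dirty_rmid, may be NULL */
	unsigned int free_map;		/* bit n set: CLOSID n is free */
};

/* Pick the free CLOSID with the lowest dirty count. Dereferences the array. */
static int sketch_find_cleanest_closid(const struct sketch_state *s)
{
	int best = -ENOSPC;

	assert(s->num_dirty_rmid != NULL);
	for (int i = 0; i < SKETCH_NUM_CLOSID; i++) {
		if (!(s->free_map & (1u << i)))
			continue;
		if (best < 0 || s->num_dirty_rmid[i] < s->num_dirty_rmid[best])
			best = i;
	}
	return best;
}

/* Fallback used when the dirty counts are unavailable or not needed. */
static int sketch_find_first_free(const struct sketch_state *s)
{
	for (int i = 0; i < SKETCH_NUM_CLOSID; i++)
		if (s->free_map & (1u << i))
			return i;
	return -ENOSPC;
}

static int sketch_closid_alloc(struct sketch_state *s)
{
	int closid;

	/* Both conditions must hold before the array is touched. */
	if (s->rmid_depends_on_closid && s->llc_occupancy_enabled)
		closid = sketch_find_cleanest_closid(s);
	else
		closid = sketch_find_first_free(s);
	if (closid < 0)
		return closid;

	s->free_map &= ~(1u << closid);
	return closid;
}
```

With only the first condition checked, a state whose array pointer is
NULL would reach sketch_find_cleanest_closid() and fault, which is the
shape of the bug described above.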
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6419e04d8a7b..04b653d613e8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -157,7 +157,8 @@ static int closid_alloc(void)
 
 	lockdep_assert_held(&rdtgroup_mutex);
 
-	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
+	    is_llc_occupancy_enabled()) {
 		cleanest_closid = resctrl_find_cleanest_closid();
 		if (cleanest_closid < 0)
 			return cleanest_closid;
--
2.39.2
* [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
2025-02-07 18:17 ` [PATCH v6 01/42] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-27 20:24 ` Moger, Babu
2025-02-07 18:17 ` [PATCH v6 03/42] x86/resctrl: Remove fflags from struct rdt_resource James Morse
` (41 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Resctrl occasionally wants to know something about a specific resource,
in these cases it reaches into the arch code's rdt_resources_all[]
array.
Once the filesystem parts of resctrl are moved to /fs/, this means it
will need visibility of the architecture specific struct
rdt_hw_resource definition, and the array of all resources. All
architectures would also need a r_resctrl member in this struct.
Instead, abstract this via a helper to allow architectures to do
different things here. Move the level enum to the resctrl header and
add a helper to retrieve the struct rdt_resource by 'rid'.
resctrl_arch_get_resource() should not return NULL for any value in
the enum; it may instead return a dummy resource that is
!alloc_capable && !mon_capable.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Change since v5:
* Ensure rdt_resources_all[] is always padded to RDT_NUM_RESOURCES
Changes since v1:
* Backed out non-functional renaming of "r" to "l3" in rdt_get_tree(),
and unhoisted the assignment of r (as now is) back into the if ()
where it started out. There seem to be no uses of this variable
outside this if().
* [Commit message only] Typo fix:
s/resctrl_hw_resource/rdt_hw_resource/g
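As a rough illustration of the contract, the following is a
self-contained userspace sketch, not the kernel code. The enum and the
helper body mirror the diff below; the two structs are trimmed-down
assumptions that keep only the capability flags, and the initialiser
values are made up. Because the array is sized by RDT_NUM_RESOURCES,
entries without an initialiser are zero-filled and so behave as the
"dummy" not-capable resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,

	/* Must be the last */
	RDT_NUM_RESOURCES,
};

/* Trimmed-down assumption: the real struct has many more members. */
struct rdt_resource {
	bool alloc_capable;
	bool mon_capable;
};

/* Arch-private wrapper; filesystem code should never need to see this. */
struct rdt_hw_resource {
	struct rdt_resource r_resctrl;
	unsigned int msr_base;	/* placeholder for arch-only state */
};

/* Padded to RDT_NUM_RESOURCES: L2 and SMBA are zero-filled here. */
static struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3] = {
		.r_resctrl = { .alloc_capable = true, .mon_capable = true },
	},
	[RDT_RESOURCE_MBA] = {
		.r_resctrl = { .alloc_capable = true },
	},
};

struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
{
	if (l >= RDT_NUM_RESOURCES)
		return NULL;

	return &rdt_resources_all[l].r_resctrl;
}
```

Callers that pass an enum value can therefore dereference the result
directly, as is_mba_linear() does in this patch, and only need to look
at the capability flags.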
---
arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 10 ----------
arch/x86/kernel/cpu/resctrl/monitor.c | 8 ++++----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
include/linux/resctrl.h | 17 +++++++++++++++++
6 files changed, 39 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 3d1735ed8d1f..12b41316d8f7 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -62,7 +62,7 @@ static void mba_wrmsr_amd(struct msr_param *m);
#define ctrl_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.ctrl_domains)
#define mon_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.mon_domains)
-struct rdt_hw_resource rdt_resources_all[] = {
+struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
[RDT_RESOURCE_L3] =
{
.r_resctrl = {
@@ -127,6 +127,14 @@ u32 resctrl_arch_system_num_rmid_idx(void)
 	return r->num_rmid;
 }
 
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
+{
+	if (l >= RDT_NUM_RESOURCES)
+		return NULL;
+
+	return &rdt_resources_all[l].r_resctrl;
+}
+
 /*
  * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
  * as they do not have CPUID enumeration support for Cache allocation.
@@ -174,7 +182,7 @@ static inline void cache_alloc_hsw_probe(void)
bool is_mba_sc(struct rdt_resource *r)
{
if (!r)
- return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
/*
* The software controller support is only applicable to MBA resource.
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 536351159cc2..4af27ef5a8a1 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -649,7 +649,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
resid = md.u.rid;
domid = md.u.domid;
evtid = md.u.evtid;
- r = &rdt_resources_all[resid].r_resctrl;
+ r = resctrl_arch_get_resource(resid);
if (md.u.sum) {
/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 20c898f09b7e..75252a7e1ebc 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -512,16 +512,6 @@ extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
extern enum resctrl_event_id mba_mbps_default_event;
-enum resctrl_res_level {
- RDT_RESOURCE_L3,
- RDT_RESOURCE_L2,
- RDT_RESOURCE_MBA,
- RDT_RESOURCE_SMBA,
-
- /* Must be the last */
- RDT_NUM_RESOURCES,
-};
-
static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 94a1d9780461..58b5b21349a8 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -365,7 +365,7 @@ static void limbo_release_entry(struct rmid_entry *entry)
*/
void __check_limbo(struct rdt_mon_domain *d, bool force_free)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
struct rmid_entry *entry;
u32 idx, cur_idx = 1;
@@ -521,7 +521,7 @@ int alloc_rmid(u32 closid)
static void add_rmid_to_limbo(struct rmid_entry *entry)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_mon_domain *d;
u32 idx;
@@ -761,7 +761,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
struct rdtgroup *entry;
u32 cur_bw, user_bw;
- r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
evt_id = rgrp->mba_mbps_event;
closid = rgrp->closid;
@@ -925,7 +925,7 @@ void mbm_handle_overflow(struct work_struct *work)
if (!resctrl_mounted || !resctrl_arch_mon_capable())
goto out_unlock;
- r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
d = container_of(work, struct rdt_mon_domain, mbm_over.work);
list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 04b653d613e8..45093b9e8e63 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2256,7 +2256,7 @@ static void l2_qos_cfg_update(void *arg)
static inline bool is_mba_linear(void)
{
- return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.delay_linear;
+ return resctrl_arch_get_resource(RDT_RESOURCE_MBA)->membw.delay_linear;
}
static int set_cache_qos_cfg(int level, bool enable)
@@ -2346,8 +2346,8 @@ static void mba_sc_domain_destroy(struct rdt_resource *r,
*/
static bool supports_mba_mbps(void)
{
- struct rdt_resource *rmbm = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ struct rdt_resource *rmbm = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
return (is_mbm_enabled() &&
r->alloc_capable && is_mba_linear() &&
@@ -2360,7 +2360,7 @@ static bool supports_mba_mbps(void)
*/
static int set_mba_sc(bool mba_sc)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
u32 num_closid = resctrl_arch_get_num_closid(r);
struct rdt_ctrl_domain *d;
unsigned long fflags;
@@ -2714,7 +2714,7 @@ static int rdt_get_tree(struct fs_context *fc)
resctrl_mounted = true;
if (is_mbm_enabled()) {
- r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
list_for_each_entry(dom, &r->mon_domains, hdr.list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
RESCTRL_PICK_ANY_CPU);
@@ -3951,7 +3951,7 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
seq_puts(seq, ",cdpl2");
- if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
+ if (is_mba_sc(resctrl_arch_get_resource(RDT_RESOURCE_MBA)))
seq_puts(seq, ",mba_MBps");
if (resctrl_debug)
@@ -4151,7 +4151,7 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
void resctrl_offline_cpu(unsigned int cpu)
{
- struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *l3 = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_mon_domain *d;
struct rdtgroup *rdtgrp;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d94abba1c716..37279e2a89da 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -37,6 +37,16 @@ enum resctrl_conf_type {
CDP_DATA,
};
+enum resctrl_res_level {
+ RDT_RESOURCE_L3,
+ RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,
+
+ /* Must be the last */
+ RDT_NUM_RESOURCES,
+};
+
#define CDP_NUM_TYPES (CDP_DATA + 1)
/*
@@ -226,6 +236,13 @@ struct rdt_resource {
bool cdp_capable;
};
+/*
+ * Get the resource that exists at this level. If the level is not supported
+ * a dummy/not-capable resource can be returned. Levels >= RDT_NUM_RESOURCES
+ * will return NULL.
+ */
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
+
/**
* struct resctrl_schema - configuration abilities of a resource presented to
* user-space
--
2.39.2
* Re: [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2025-02-07 18:17 ` [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
@ 2025-02-27 20:24 ` Moger, Babu
2025-02-28 19:53 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Moger, Babu @ 2025-02-27 20:24 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi James,
You probably missed a few cases here.
static int logical_rmid_to_physical_rmid(int cpu, int lrmid)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
--
}
Any reason not to use the new call resctrl_arch_get_resource() here?
Thanks
Babu
On 2/7/25 12:17, James Morse wrote:
> Resctrl occasionally wants to know something about a specific resource,
> in these cases it reaches into the arch code's rdt_resources_all[]
> array.
>
> Once the filesystem parts of resctrl are moved to /fs/, this means it
> will need visibility of the architecture specific struct
> rdt_hw_resource definition, and the array of all resources. All
> architectures would also need a r_resctrl member in this struct.
>
> Instead, abstract this via a helper to allow architectures to do
> different things here. Move the level enum to the resctrl header and
> add a helper to retrieve the struct rdt_resource by 'rid'.
>
> resctrl_arch_get_resource() should not return NULL for any value in
> the enum; it may instead return a dummy resource that is
> !alloc_capable && !mon_capable.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Change since v5:
> * Ensure rdt_resources_all[] is always padded to RDT_NUM_RESOURCES
>
> Changes since v1:
> * Backed out non-functional renaming of "r" to "l3" in rdt_get_tree(),
> and unhoisted the assignment of r (as now is) back into the if ()
> where it started out. There seem to be no uses of this variable
> outside this if().
> * [Commit message only] Typo fix:
> s/resctrl_hw_resource/rdt_hw_resource/g
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
> arch/x86/kernel/cpu/resctrl/internal.h | 10 ----------
> arch/x86/kernel/cpu/resctrl/monitor.c | 8 ++++----
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
> include/linux/resctrl.h | 17 +++++++++++++++++
> 6 files changed, 39 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 3d1735ed8d1f..12b41316d8f7 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -62,7 +62,7 @@ static void mba_wrmsr_amd(struct msr_param *m);
> #define ctrl_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.ctrl_domains)
> #define mon_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.mon_domains)
>
> -struct rdt_hw_resource rdt_resources_all[] = {
> +struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
> [RDT_RESOURCE_L3] =
> {
> .r_resctrl = {
> @@ -127,6 +127,14 @@ u32 resctrl_arch_system_num_rmid_idx(void)
> return r->num_rmid;
> }
>
> +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
> +{
> + if (l >= RDT_NUM_RESOURCES)
> + return NULL;
> +
> + return &rdt_resources_all[l].r_resctrl;
> +}
> +
> /*
> * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
> * as they do not have CPUID enumeration support for Cache allocation.
> @@ -174,7 +182,7 @@ static inline void cache_alloc_hsw_probe(void)
> bool is_mba_sc(struct rdt_resource *r)
> {
> if (!r)
> - return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
> + r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
>
> /*
> * The software controller support is only applicable to MBA resource.
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 536351159cc2..4af27ef5a8a1 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -649,7 +649,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> resid = md.u.rid;
> domid = md.u.domid;
> evtid = md.u.evtid;
> - r = &rdt_resources_all[resid].r_resctrl;
> + r = resctrl_arch_get_resource(resid);
>
> if (md.u.sum) {
> /*
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 20c898f09b7e..75252a7e1ebc 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -512,16 +512,6 @@ extern struct rdtgroup rdtgroup_default;
> extern struct dentry *debugfs_resctrl;
> extern enum resctrl_event_id mba_mbps_default_event;
>
> -enum resctrl_res_level {
> - RDT_RESOURCE_L3,
> - RDT_RESOURCE_L2,
> - RDT_RESOURCE_MBA,
> - RDT_RESOURCE_SMBA,
> -
> - /* Must be the last */
> - RDT_NUM_RESOURCES,
> -};
> -
> static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
> {
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 94a1d9780461..58b5b21349a8 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -365,7 +365,7 @@ static void limbo_release_entry(struct rmid_entry *entry)
> */
> void __check_limbo(struct rdt_mon_domain *d, bool force_free)
> {
> - struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> u32 idx_limit = resctrl_arch_system_num_rmid_idx();
> struct rmid_entry *entry;
> u32 idx, cur_idx = 1;
> @@ -521,7 +521,7 @@ int alloc_rmid(u32 closid)
>
> static void add_rmid_to_limbo(struct rmid_entry *entry)
> {
> - struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> struct rdt_mon_domain *d;
> u32 idx;
>
> @@ -761,7 +761,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> struct rdtgroup *entry;
> u32 cur_bw, user_bw;
>
> - r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
> + r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
> evt_id = rgrp->mba_mbps_event;
>
> closid = rgrp->closid;
> @@ -925,7 +925,7 @@ void mbm_handle_overflow(struct work_struct *work)
> if (!resctrl_mounted || !resctrl_arch_mon_capable())
> goto out_unlock;
>
> - r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> d = container_of(work, struct rdt_mon_domain, mbm_over.work);
>
> list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 04b653d613e8..45093b9e8e63 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2256,7 +2256,7 @@ static void l2_qos_cfg_update(void *arg)
>
> static inline bool is_mba_linear(void)
> {
> - return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.delay_linear;
> + return resctrl_arch_get_resource(RDT_RESOURCE_MBA)->membw.delay_linear;
> }
>
> static int set_cache_qos_cfg(int level, bool enable)
> @@ -2346,8 +2346,8 @@ static void mba_sc_domain_destroy(struct rdt_resource *r,
> */
> static bool supports_mba_mbps(void)
> {
> - struct rdt_resource *rmbm = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> - struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
> + struct rdt_resource *rmbm = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
>
> return (is_mbm_enabled() &&
> r->alloc_capable && is_mba_linear() &&
> @@ -2360,7 +2360,7 @@ static bool supports_mba_mbps(void)
> */
> static int set_mba_sc(bool mba_sc)
> {
> - struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
> u32 num_closid = resctrl_arch_get_num_closid(r);
> struct rdt_ctrl_domain *d;
> unsigned long fflags;
> @@ -2714,7 +2714,7 @@ static int rdt_get_tree(struct fs_context *fc)
> resctrl_mounted = true;
>
> if (is_mbm_enabled()) {
> - r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> list_for_each_entry(dom, &r->mon_domains, hdr.list)
> mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
> RESCTRL_PICK_ANY_CPU);
> @@ -3951,7 +3951,7 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
> if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
> seq_puts(seq, ",cdpl2");
>
> - if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
> + if (is_mba_sc(resctrl_arch_get_resource(RDT_RESOURCE_MBA)))
> seq_puts(seq, ",mba_MBps");
>
> if (resctrl_debug)
> @@ -4151,7 +4151,7 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
>
> void resctrl_offline_cpu(unsigned int cpu)
> {
> - struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + struct rdt_resource *l3 = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> struct rdt_mon_domain *d;
> struct rdtgroup *rdtgrp;
>
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index d94abba1c716..37279e2a89da 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -37,6 +37,16 @@ enum resctrl_conf_type {
> CDP_DATA,
> };
>
> +enum resctrl_res_level {
> + RDT_RESOURCE_L3,
> + RDT_RESOURCE_L2,
> + RDT_RESOURCE_MBA,
> + RDT_RESOURCE_SMBA,
> +
> + /* Must be the last */
> + RDT_NUM_RESOURCES,
> +};
> +
> #define CDP_NUM_TYPES (CDP_DATA + 1)
>
> /*
> @@ -226,6 +236,13 @@ struct rdt_resource {
> bool cdp_capable;
> };
>
> +/*
> + * Get the resource that exists at this level. If the level is not supported
> + * a dummy/not-capable resource can be returned. Levels >= RDT_NUM_RESOURCES
> + * will return NULL.
> + */
> +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
> +
> /**
> * struct resctrl_schema - configuration abilities of a resource presented to
> * user-space
--
Thanks
Babu Moger
* Re: [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2025-02-27 20:24 ` Moger, Babu
@ 2025-02-28 19:53 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: babu.moger, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Babu,
(please don't top post!)
On 27/02/2025 20:24, Moger, Babu wrote:
> On 2/7/25 12:17, James Morse wrote:
>> Resctrl occasionally wants to know something about a specific resource,
>> in these cases it reaches into the arch code's rdt_resources_all[]
>> array.
>>
>> Once the filesystem parts of resctrl are moved to /fs/, this means it
>> will need visibility of the architecture specific struct
>> rdt_hw_resource definition, and the array of all resources. All
>> architectures would also need a r_resctrl member in this struct.
>>
>> Instead, abstract this via a helper to allow architectures to do
>> different things here. Move the level enum to the resctrl header and
>> add a helper to retrieve the struct rdt_resource by 'rid'.
>>
>> resctrl_arch_get_resource() should not return NULL for any value in
>> the enum; it may instead return a dummy resource that is
>> !alloc_capable && !mon_capable.
> You probably missed a few cases here.
>
> static int logical_rmid_to_physical_rmid(int cpu, int lrmid)
> {
> struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> --
>
> }
>
> Any reason not to use the new call resctrl_arch_get_resource() here?
Simply, there is no need to! That code doesn't get moved to /fs/, so it doesn't need to use
a helper. Taken to its extreme: after the code is moved to /fs/, there are no callers of
resctrl_arch_get_resource() in the arch code, because whatever they do today is sufficient.
Thanks,
James
* [PATCH v6 03/42] x86/resctrl: Remove fflags from struct rdt_resource
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
2025-02-07 18:17 ` [PATCH v6 01/42] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
2025-02-07 18:17 ` [PATCH v6 02/42] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 21:48 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values James Morse
` (40 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The resctrl arch code specifies whether a resource controls a cache or
memory using the fflags field. This field is then used by resctrl to
determine which files should be exposed in the filesystem.
Allowing the architecture to pick this value means the RFTYPE_
flags have to be in a shared header, and allows an architecture
to create a combination that resctrl does not support.
Remove the fflags field, and pick the value based on the resource
id.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Made fflags_from_resource() return an unsigned long.
* Removed a space.
Changes since v4:
* Removed an extra space
* Fixed a typo
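To show the mapping on its own, here is a small userspace sketch of
deriving the file flags from the resource id. The enum matches the one
in include/linux/resctrl.h; the RFTYPE_ bit values below are
placeholders chosen for the example (the real definitions live in the
resctrl internal header), the struct is trimmed to the one member the
switch needs, and the unknown-id path simply returns 0 where the patch
uses WARN_ON_ONCE(1).

```c
#include <assert.h>

enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,

	/* Must be the last */
	RDT_NUM_RESOURCES,
};

/* Placeholder bit values for illustration only. */
#define RFTYPE_INFO		(1UL << 0)
#define RFTYPE_CTRL		(1UL << 4)
#define RFTYPE_MON		(1UL << 5)
#define RFTYPE_RES_CACHE	(1UL << 8)
#define RFTYPE_RES_MB		(1UL << 9)
#define RFTYPE_CTRL_INFO	(RFTYPE_INFO | RFTYPE_CTRL)
#define RFTYPE_MON_INFO		(RFTYPE_INFO | RFTYPE_MON)

/* Trimmed-down assumption: only the id is needed for the mapping. */
struct rdt_resource {
	enum resctrl_res_level rid;
};

/* The filesystem code, not the architecture, decides the resource type. */
static unsigned long fflags_from_resource(const struct rdt_resource *r)
{
	switch (r->rid) {
	case RDT_RESOURCE_L3:
	case RDT_RESOURCE_L2:
		return RFTYPE_RES_CACHE;
	case RDT_RESOURCE_MBA:
	case RDT_RESOURCE_SMBA:
		return RFTYPE_RES_MB;
	default:
		break;
	}

	/* Unknown id: the patch warns here, this sketch just returns 0. */
	return 0;
}
```

A caller then ORs in the directory type, for example
fflags_from_resource(r) | RFTYPE_CTRL_INFO, so an architecture can no
longer supply a combination the filesystem code does not expect.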
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++++++--
include/linux/resctrl.h | 2 --
3 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 12b41316d8f7..8ef2e449b465 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -74,7 +74,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -88,7 +87,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -102,7 +100,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
},
},
[RDT_RESOURCE_SMBA] =
@@ -114,7 +111,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
},
},
};
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 45093b9e8e63..3391ac8ecb2d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2165,6 +2165,20 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 	return ret;
 }
 
+static unsigned long fflags_from_resource(struct rdt_resource *r)
+{
+	switch (r->rid) {
+	case RDT_RESOURCE_L3:
+	case RDT_RESOURCE_L2:
+		return RFTYPE_RES_CACHE;
+	case RDT_RESOURCE_MBA:
+	case RDT_RESOURCE_SMBA:
+		return RFTYPE_RES_MB;
+	}
+
+	return WARN_ON_ONCE(1);
+}
+
 static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 {
 	struct resctrl_schema *s;
@@ -2185,14 +2199,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
/* loop over enabled controls, these are all alloc_capable */
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- fflags = r->fflags | RFTYPE_CTRL_INFO;
+ fflags = fflags_from_resource(r) | RFTYPE_CTRL_INFO;
ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}
for_each_mon_capable_rdt_resource(r) {
- fflags = r->fflags | RFTYPE_MON_INFO;
+ fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
sprintf(name, "%s_MON", r->name);
ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 37279e2a89da..496ddcaa4ecf 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -210,7 +210,6 @@ enum resctrl_scope {
* @format_str: Per resource format string to show domain value
* @parse_ctrlval: Per resource function pointer to parse control values
* @evt_list: List of monitoring events
- * @fflags: flags to choose base and info files
* @cdp_capable: Is the CDP feature available on this resource
*/
struct rdt_resource {
@@ -232,7 +231,6 @@ struct rdt_resource {
struct resctrl_schema *s,
struct rdt_ctrl_domain *d);
struct list_head evt_list;
- unsigned long fflags;
bool cdp_capable;
};
--
2.39.2
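
For readers following along outside the kernel tree, the idea behind
fflags_from_resource() can be tried in plain user-space C. This is only a
sketch: the sketch_* names, the enum and the flag bit values are stand-ins
invented for illustration, not the kernel's definitions, and where the patch
uses WARN_ON_ONCE() the sketch prints to stderr and returns 0.

```c
#include <stdio.h>

/* Stand-in resource ids; the real enum is defined by the kernel headers. */
enum sketch_res_id {
	SKETCH_RES_L3,
	SKETCH_RES_L2,
	SKETCH_RES_MBA,
	SKETCH_RES_SMBA,
	SKETCH_NUM_RES,
};

/* Stand-in flag bits; the values are assumptions chosen for this sketch. */
#define SKETCH_RES_CACHE	(1UL << 0)
#define SKETCH_RES_MB		(1UL << 1)

/*
 * Derive the file-type flags from the resource id, so the code that
 * describes the hardware never gets to pick a flag combination itself.
 */
static unsigned long sketch_fflags_from_rid(enum sketch_res_id rid)
{
	switch (rid) {
	case SKETCH_RES_L3:
	case SKETCH_RES_L2:
		return SKETCH_RES_CACHE;
	case SKETCH_RES_MBA:
	case SKETCH_RES_SMBA:
		return SKETCH_RES_MB;
	default:
		break;
	}

	/* Unknown id: the patch warns once here, the sketch just reports it. */
	fprintf(stderr, "unknown resource id %d\n", (int)rid);
	return 0;
}
```

A caller would OR the result with a control or monitor flag of its own, in the
same way the patch ORs in RFTYPE_CTRL_INFO or RFTYPE_MON_INFO.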
* Re: [PATCH v6 03/42] x86/resctrl: Remove fflags from struct rdt_resource
2025-02-07 18:17 ` [PATCH v6 03/42] x86/resctrl: Remove fflags from struct rdt_resource James Morse
@ 2025-02-19 21:48 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 21:48 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> The resctrl arch code specifies whether a resource controls a cache or
> memory using the fflags field. This field is then used by resctrl to
> determine which files should be exposed in the filesystem.
>
> Allowing the architecture to pick this value means the RFTYPE_
> flags have to be in a shared header, and allows an architecture
> to create a combination that resctrl does not support.
>
> Remove the fflags field, and pick the value based on the resource
> id.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (2 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 03/42] x86/resctrl: Remove fflags from struct rdt_resource James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 21:52 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 05/42] x86/resctrl: Use schema type to determine the schema format string James Morse
` (39 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Resctrl's architecture code gets to specify a function pointer that is
used when parsing schema entries. This is expected to be one of two
helpers from the filesystem code.

Setting this function pointer allows the architecture code to change
the ABI resctrl presents to user-space, and forces resctrl to expose
these helpers.

Instead, add a schema format enum to choose which schema parser to
use. This allows the helpers to be made static and the structs used
for passing arguments moved out of shared headers.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Merged the contents of get_parser() with its only caller.
* Removed the description of what the range schema is used for.
* Waggled some whitespace.
Changes since v4:
* Creation of the enum moves into this patch - review tags not picked up.
* Removed some whitespace.
Changes since v3:
* Removed a spurious semicolon
Changes since v2:
* This patch is new
---
arch/x86/kernel/cpu/resctrl/core.c | 8 +++---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 32 +++++++++++++++++++----
arch/x86/kernel/cpu/resctrl/internal.h | 10 -------
include/linux/resctrl.h | 17 ++++++++----
4 files changed, 43 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 8ef2e449b465..e9fe129a02f8 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -72,7 +72,7 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.mon_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L3),
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
- .parse_ctrlval = parse_cbm,
+ .schema_fmt = RESCTRL_SCHEMA_BITMAP,
.format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L3_CBM_BASE,
@@ -85,7 +85,7 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.name = "L2",
.ctrl_scope = RESCTRL_L2_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
- .parse_ctrlval = parse_cbm,
+ .schema_fmt = RESCTRL_SCHEMA_BITMAP,
.format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L2_CBM_BASE,
@@ -98,7 +98,7 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.name = "MB",
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
- .parse_ctrlval = parse_bw,
+ .schema_fmt = RESCTRL_SCHEMA_RANGE,
.format_str = "%d=%*u",
},
},
@@ -109,7 +109,7 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.name = "SMBA",
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
- .parse_ctrlval = parse_bw,
+ .schema_fmt = RESCTRL_SCHEMA_RANGE,
.format_str = "%d=%*u",
},
},
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 4af27ef5a8a1..f4334f437ffc 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -23,6 +23,15 @@
#include "internal.h"
+struct rdt_parse_data {
+ struct rdtgroup *rdtgrp;
+ char *buf;
+};
+
+typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
+ struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d);
+
/*
* Check whether MBA bandwidth percentage value is correct. The value is
* checked against the minimum and max bandwidth values specified by the
@@ -64,8 +73,8 @@ static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
return true;
}
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
struct resctrl_staged_config *cfg;
u32 closid = data->rdtgrp->closid;
@@ -143,8 +152,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
struct rdtgroup *rdtgrp = data->rdtgrp;
struct resctrl_staged_config *cfg;
@@ -210,6 +219,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
enum resctrl_conf_type t = s->conf_type;
+ ctrlval_parser_t *parse_ctrlval = NULL;
struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
struct rdt_parse_data data;
@@ -220,6 +230,18 @@ static int parse_line(char *line, struct resctrl_schema *s,
/* Walking r->domains, ensure it can't race with cpuhp */
lockdep_assert_cpus_held();
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		parse_ctrlval = &parse_cbm;
+		break;
+	case RESCTRL_SCHEMA_RANGE:
+		parse_ctrlval = &parse_bw;
+		break;
+	}
+
+	if (WARN_ON_ONCE(!parse_ctrlval))
+		return -EINVAL;
+
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
(r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) {
rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n");
@@ -240,7 +262,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
if (d->hdr.id == dom_id) {
data.buf = dom;
data.rdtgrp = rdtgrp;
- if (r->parse_ctrlval(&data, s, d))
+ if (parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
cfg = &d->staged_config[t];
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 75252a7e1ebc..b5543bd506c3 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -459,11 +459,6 @@ static inline bool is_mbm_event(int e)
e <= QOS_L3_MBM_LOCAL_EVENT_ID);
}
-struct rdt_parse_data {
- struct rdtgroup *rdtgrp;
- char *buf;
-};
-
/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
* @r_resctrl: Attributes of the resource used directly by resctrl.
@@ -500,11 +495,6 @@ static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r
return container_of(r, struct rdt_hw_resource, r_resctrl);
}
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
-
extern struct mutex rdtgroup_mutex;
extern struct rdt_hw_resource rdt_resources_all[];
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 496ddcaa4ecf..aed231a6d30c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -183,7 +183,6 @@ struct resctrl_membw {
u32 *mb_map;
};
-struct rdt_parse_data;
struct resctrl_schema;
enum resctrl_scope {
@@ -192,6 +191,16 @@ enum resctrl_scope {
RESCTRL_L3_NODE,
};
+/**
+ * enum resctrl_schema_fmt - The format user-space provides for a schema.
+ * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
+ * @RESCTRL_SCHEMA_RANGE: The schema is a decimal number,
+ */
+enum resctrl_schema_fmt {
+ RESCTRL_SCHEMA_BITMAP,
+ RESCTRL_SCHEMA_RANGE,
+};
+
/**
* struct rdt_resource - attributes of a resctrl resource
* @rid: The index of the resource
@@ -208,7 +217,7 @@ enum resctrl_scope {
* @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @format_str: Per resource format string to show domain value
- * @parse_ctrlval: Per resource function pointer to parse control values
+ * @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
*/
@@ -227,9 +236,7 @@ struct rdt_resource {
int data_width;
u32 default_ctrl;
const char *format_str;
- int (*parse_ctrlval)(struct rdt_parse_data *data,
- struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
+ enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
};
--
2.39.2
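
The enum-to-parser dispatch this patch adds to parse_line() can be
illustrated with a small user-space sketch. Everything named sketch_* is
hypothetical, and the two parsers are simplified strtoul() wrappers rather
than the kernel's parse_cbm() and parse_bw(), which also validate the value
against the resource. The format is passed as an int so that a junk value can
be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in for enum resctrl_schema_fmt. */
enum sketch_schema_fmt {
	SKETCH_SCHEMA_BITMAP,
	SKETCH_SCHEMA_RANGE,
};

typedef int (sketch_parser_t)(const char *buf, unsigned long *val);

/* Convert the whole string in the given base, rejecting trailing junk. */
static int sketch_parse_base(const char *buf, unsigned long *val, int base)
{
	char *end;

	errno = 0;
	*val = strtoul(buf, &end, base);
	if (errno || end == buf || *end != '\0')
		return -EINVAL;

	return 0;
}

/* Bitmap schemas carry a hex value, e.g. "ff". */
static int sketch_parse_bitmap(const char *buf, unsigned long *val)
{
	return sketch_parse_base(buf, val, 16);
}

/* Range schemas carry a decimal value, e.g. "50". */
static int sketch_parse_range(const char *buf, unsigned long *val)
{
	return sketch_parse_base(buf, val, 10);
}

/* Pick the parser from the schema format instead of a stored pointer. */
static int sketch_parse_value(int fmt, const char *buf, unsigned long *val)
{
	sketch_parser_t *parse = NULL;

	switch (fmt) {
	case SKETCH_SCHEMA_BITMAP:
		parse = sketch_parse_bitmap;
		break;
	case SKETCH_SCHEMA_RANGE:
		parse = sketch_parse_range;
		break;
	}

	/* A format the switch does not know leaves the pointer NULL. */
	if (!parse)
		return -EINVAL;

	return parse(buf, val);
}
```

Because only sketch_parse_value() refers to the two parsers, they can stay
static, which is the property the commit message is after.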
* Re: [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values
2025-02-07 18:17 ` [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2025-02-19 21:52 ` Reinette Chatre
2025-02-28 19:50 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 21:52 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> Resctrl's architecture code gets to specify a function pointer that is
> used when parsing schema entries. This is expected to be one of two
> helpers from the filesystem code.
>
> Setting this function pointer allows the architecture code to change
> the ABI resctrl presents to user-space, and forces resctrl to expose
> these helpers.
>
> Instead, add a schema format enum to choose which schema parser to
> use. This allows the helpers to be made static and the structs used
> for passing arguments moved out of shared headers.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
...
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 496ddcaa4ecf..aed231a6d30c 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -183,7 +183,6 @@ struct resctrl_membw {
> u32 *mb_map;
> };
>
> -struct rdt_parse_data;
> struct resctrl_schema;
>
> enum resctrl_scope {
> @@ -192,6 +191,16 @@ enum resctrl_scope {
> RESCTRL_L3_NODE,
> };
>
> +/**
> + * enum resctrl_schema_fmt - The format user-space provides for a schema.
> + * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
> + * @RESCTRL_SCHEMA_RANGE: The schema is a decimal number,
Nit: Please let sentence end with a period.
> + */
> +enum resctrl_schema_fmt {
> + RESCTRL_SCHEMA_BITMAP,
> + RESCTRL_SCHEMA_RANGE,
> +};
> +
> /**
> * struct rdt_resource - attributes of a resctrl resource
> * @rid: The index of the resource
> @@ -208,7 +217,7 @@ enum resctrl_scope {
> * @data_width: Character width of data when displaying
> * @default_ctrl: Specifies default cache cbm or memory B/W percent.
> * @format_str: Per resource format string to show domain value
> - * @parse_ctrlval: Per resource function pointer to parse control values
> + * @schema_fmt: Which format string and parser is used for this schema.
> * @evt_list: List of monitoring events
> * @cdp_capable: Is the CDP feature available on this resource
> */
> @@ -227,9 +236,7 @@ struct rdt_resource {
> int data_width;
> u32 default_ctrl;
> const char *format_str;
> - int (*parse_ctrlval)(struct rdt_parse_data *data,
> - struct resctrl_schema *s,
> - struct rdt_ctrl_domain *d);
> + enum resctrl_schema_fmt schema_fmt;
> struct list_head evt_list;
> bool cdp_capable;
> };
| Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* Re: [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values
2025-02-19 21:52 ` Reinette Chatre
@ 2025-02-28 19:50 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:50 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 19/02/2025 21:52, Reinette Chatre wrote:
> On 2/7/25 10:17 AM, James Morse wrote:
>> Resctrl's architecture code gets to specify a function pointer that is
>> used when parsing schema entries. This is expected to be one of two
>> helpers from the filesystem code.
>>
>> Setting this function pointer allows the architecture code to change
>> the ABI resctrl presents to user-space, and forces resctrl to expose
>> these helpers.
>>
>> Instead, add a schema format enum to choose which schema parser to
>> use. This allows the helpers to be made static and the structs used
>> for passing arguments moved out of shared headers.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 496ddcaa4ecf..aed231a6d30c 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -192,6 +191,16 @@ enum resctrl_scope {
>> RESCTRL_L3_NODE,
>> };
>>
>> +/**
>> + * enum resctrl_schema_fmt - The format user-space provides for a schema.
>> + * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
>> + * @RESCTRL_SCHEMA_RANGE: The schema is a decimal number,
>
> Nit: Please let sentence end with a period.
... me and my fat fingers ...
>> + */
>> +enum resctrl_schema_fmt {
>> + RESCTRL_SCHEMA_BITMAP,
>> + RESCTRL_SCHEMA_RANGE,
>> +};
>> +
>> /**
>> * struct rdt_resource - attributes of a resctrl resource
>> * @rid: The index of the resource
> | Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Thanks!
James
* [PATCH v6 05/42] x86/resctrl: Use schema type to determine the schema format string
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (3 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 04/42] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-07 18:17 ` [PATCH v6 06/42] x86/resctrl: Remove data_width and the tabular format James Morse
` (38 subsequent siblings)
43 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Resctrl's architecture code gets to specify a format string that is
used when printing schema entries. This is expected to be one of two
values that the filesystem code supports.

Setting this format string allows the architecture code to change
the ABI resctrl presents to user-space.

Instead, use the schema format enum to choose which format string to
use.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Added runtime check for resource schema_format being junk.
Changes since v4:
* Added a stop to a struct comment.
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ----
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 ++++++++++++++
include/linux/resctrl.h | 4 ++--
4 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e9fe129a02f8..542503a8c953 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -73,7 +73,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L3),
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
.schema_fmt = RESCTRL_SCHEMA_BITMAP,
- .format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -86,7 +85,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_scope = RESCTRL_L2_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
.schema_fmt = RESCTRL_SCHEMA_BITMAP,
- .format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -99,7 +97,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
.schema_fmt = RESCTRL_SCHEMA_RANGE,
- .format_str = "%d=%*u",
},
},
[RDT_RESOURCE_SMBA] =
@@ -110,7 +107,6 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
.schema_fmt = RESCTRL_SCHEMA_RANGE,
- .format_str = "%d=%*u",
},
},
};
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index f4334f437ffc..c763cb4fb1a8 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -487,7 +487,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
ctrl_val = resctrl_arch_get_config(r, dom, closid,
schema->conf_type);
- seq_printf(s, r->format_str, dom->hdr.id, max_data_width,
+ seq_printf(s, schema->fmt_str, dom->hdr.id, max_data_width,
ctrl_val);
sep = true;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3391ac8ecb2d..e7862d0936c9 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2611,6 +2611,20 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
if (cl > max_name_width)
max_name_width = cl;
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		s->fmt_str = "%d=%0*x";
+		break;
+	case RESCTRL_SCHEMA_RANGE:
+		s->fmt_str = "%d=%0*u";
+		break;
+	}
+
+	if (WARN_ON_ONCE(!s->fmt_str)) {
+		kfree(s);
+		return -EINVAL;
+	}
+
+
INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index aed231a6d30c..547f47065096 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -216,7 +216,6 @@ enum resctrl_schema_fmt {
* @name: Name to use in "schemata" file.
* @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
- * @format_str: Per resource format string to show domain value
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
@@ -235,7 +234,6 @@ struct rdt_resource {
char *name;
int data_width;
u32 default_ctrl;
- const char *format_str;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
@@ -253,6 +251,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
* user-space
* @list: Member of resctrl_schema_all.
* @name: The name to use in the "schemata" file.
+ * @fmt_str: Format string to show domain value.
* @conf_type: Whether this schema is specific to code/data.
* @res: The resource structure exported by the architecture to describe
* the hardware that is configured by this schema.
@@ -263,6 +262,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
struct resctrl_schema {
struct list_head list;
char name[8];
+ const char *fmt_str;
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
u32 num_closid;
--
2.39.2
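
As a rough user-space illustration of what this patch does in
schemata_list_add() and show_doms(), the sketch below picks the format string
from a format enum and refuses to print when the enum value is junk. The
demo_* names are hypothetical and snprintf() stands in for seq_printf(). The
two strings are the ones in the diff above, so the caller still has to pass a
field width for the '*'.

```c
#include <stdio.h>

/* Stand-in for enum resctrl_schema_fmt. */
enum demo_fmt {
	DEMO_FMT_BITMAP,
	DEMO_FMT_RANGE,
};

/* Return the format string for a schema format, or NULL if unknown. */
static const char *demo_fmt_str(int fmt)
{
	switch (fmt) {
	case DEMO_FMT_BITMAP:
		return "%d=%0*x";
	case DEMO_FMT_RANGE:
		return "%d=%0*u";
	}

	return NULL;
}

/*
 * Format one "domain=value" entry into out. Returns the snprintf() result,
 * or -1 when the format is not one the sketch knows about.
 */
static int demo_show_dom(char *out, size_t len, int fmt, int dom_id,
			 int width, unsigned int val)
{
	const char *f = demo_fmt_str(fmt);

	if (!f)
		return -1;

	return snprintf(out, len, f, dom_id, width, val);
}
```

With a width of 4, a bitmap value of 0xff is rendered as 0=00ff and a range
value of 50 as 0=0050.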
* [PATCH v6 06/42] x86/resctrl: Remove data_width and the tabular format
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (4 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 05/42] x86/resctrl: Use schema type to determine the schema format string James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 21:56 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 07/42] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
` (37 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The resctrl architecture code provides a data_width for the controls of
each resource. This is used to zero pad all control values in the schemata
file so they appear in columns. The same is done with the resource names
to complete the visual effect. e.g.
| SMBA:0=2048
| L3:0=00ff

AMD platforms discover their maximum bandwidth for the MB resource from
firmware, but hard-code the data_width to 4. If the maximum bandwidth
requires more digits - the tabular format is silently broken.
This is also broken when the mba_MBps mount option is used as the
field width isn't updated. If new schema are added resctrl will need
to be able to determine the maximum width. The benefit of this
pretty-printing is questionable.

Instead of handling runtime discovery of the data_width for AMD platforms,
remove the feature. These fields are always zero padded so should be
harmless to remove if the whole field has been treated as a number.
In the above example, this would now look like this:
| SMBA:0=2048
| L3:0=ff

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 26 -----------------------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 3 +--
arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++++--
include/linux/resctrl.h | 2 --
5 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 542503a8c953..754fb65565ec 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -43,12 +43,6 @@ static DEFINE_MUTEX(domain_list_lock);
*/
DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state);
-/*
- * Used to store the max resource name width and max resource data width
- * to display the schemata in a tabular format
- */
-int max_name_width, max_data_width;
-
/*
* Global boolean for rdt_alloc which is true if any
* resource allocation is enabled.
@@ -228,7 +222,6 @@ static __init bool __get_mem_config_intel(struct rdt_resource *r)
return false;
r->membw.arch_needs_linear = false;
}
- r->data_width = 3;
if (boot_cpu_has(X86_FEATURE_PER_THREAD_MBA))
r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
@@ -269,8 +262,6 @@ static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)
r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
r->membw.min_bw = 0;
r->membw.bw_gran = 1;
- /* Max value is 2048, Data width should be 4 in decimal */
- r->data_width = 4;
r->alloc_capable = true;
@@ -290,7 +281,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
r->cache.cbm_len = eax.split.cbm_len + 1;
r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->cache.shareable_bits = ebx & r->default_ctrl;
- r->data_width = (r->cache.cbm_len + 3) / 4;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
r->alloc_capable = true;
@@ -786,20 +776,6 @@ static int resctrl_arch_offline_cpu(unsigned int cpu)
return 0;
}
-/*
- * Choose a width for the resource name and resource data based on the
- * resource that has widest name and cbm.
- */
-static __init void rdt_init_padding(void)
-{
- struct rdt_resource *r;
-
- for_each_alloc_capable_rdt_resource(r) {
- if (r->data_width > max_data_width)
- max_data_width = r->data_width;
- }
-}
-
enum {
RDT_FLAG_CMT,
RDT_FLAG_MBM_TOTAL,
@@ -1102,8 +1078,6 @@ static int __init resctrl_late_init(void)
if (!get_rdt_resources())
return -ENODEV;
- rdt_init_padding();
-
state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
"x86/resctrl/cat:online:",
resctrl_arch_online_cpu,
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index c763cb4fb1a8..59610b209b4e 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -487,8 +487,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
ctrl_val = resctrl_arch_get_config(r, dom, closid,
schema->conf_type);
- seq_printf(s, schema->fmt_str, dom->hdr.id, max_data_width,
- ctrl_val);
+ seq_printf(s, schema->fmt_str, dom->hdr.id, ctrl_val);
sep = true;
}
seq_puts(s, "\n");
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index b5543bd506c3..f975cd6cfe61 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -326,7 +326,7 @@ struct rdtgroup {
/* List of all resource groups */
extern struct list_head rdt_all_groups;
-extern int max_name_width, max_data_width;
+extern int max_name_width;
int __init rdtgroup_init(void);
void __exit rdtgroup_exit(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e7862d0936c9..1e0bae1a9d95 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -57,6 +57,12 @@ static struct kernfs_node *kn_mongrp;
/* Kernel fs node for "mon_data" directory under root */
static struct kernfs_node *kn_mondata;
+/*
+ * Used to store the max resource name width to display the schemata names in
+ * a tabular format.
+ */
+int max_name_width;
+
static struct seq_buf last_cmd_status;
static char last_cmd_status_buf[512];
@@ -2613,10 +2619,10 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
switch (r->schema_fmt) {
case RESCTRL_SCHEMA_BITMAP:
- s->fmt_str = "%d=%0*x";
+ s->fmt_str = "%d=%x";
break;
case RESCTRL_SCHEMA_RANGE:
- s->fmt_str = "%d=%0*u";
+ s->fmt_str = "%d=%u";
break;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 547f47065096..41eee6377a0f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -214,7 +214,6 @@ enum resctrl_schema_fmt {
* @ctrl_domains: RCU list of all control domains for this resource
* @mon_domains: RCU list of all monitor domains for this resource
* @name: Name to use in "schemata" file.
- * @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
@@ -232,7 +231,6 @@ struct rdt_resource {
struct list_head ctrl_domains;
struct list_head mon_domains;
char *name;
- int data_width;
u32 default_ctrl;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
--
2.39.2
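
The effect of dropping the field width can be reproduced in user-space with
the format strings from the diff above. This is a sketch only: the sketch_*
helpers are hypothetical, snprintf() stands in for seq_printf(), and
strtoul() stands in for whatever a user-space tool uses to read the value
back.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hex digits needed for a bitmap of cbm_len bits, as the removed code had it. */
static int sketch_data_width(unsigned int cbm_len)
{
	return (cbm_len + 3) / 4;
}

/* Old output: zero padded to a width supplied by the caller. */
static int sketch_show_padded(char *out, size_t len, int dom_id, int width,
			      unsigned int val)
{
	return snprintf(out, len, "%d=%0*x", dom_id, width, val);
}

/* New output: no padding, just the digits the value needs. */
static int sketch_show_plain(char *out, size_t len, int dom_id,
			     unsigned int val)
{
	return snprintf(out, len, "%d=%x", dom_id, val);
}

/* Read a hex field back as a number; leading zeros do not change it. */
static unsigned long sketch_parse_hex(const char *field)
{
	return strtoul(field, NULL, 16);
}
```

For domain 0 and a value of 0xff, the padded helper with a width of 4 gives
0=00ff and the plain helper gives 0=ff, matching the L3 lines in the commit
message. A reader that converts the field to a number sees 255 either way,
which is the case the commit message describes as harmless. A tool that
compares the raw text or relies on the column layout would notice the change.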
* Re: [PATCH v6 06/42] x86/resctrl: Remove data_width and the tabular format
2025-02-07 18:17 ` [PATCH v6 06/42] x86/resctrl: Remove data_width and the tabular format James Morse
@ 2025-02-19 21:56 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 21:56 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> The resctrl architecture code provides a data_width for the controls of
> each resource. This is used to zero pad all control values in the schemata
> file so they appear in columns. The same is done with the resource names
> to complete the visual effect. e.g.
> | SMBA:0=2048
> | L3:0=00ff
>
> AMD platforms discover their maximum bandwidth for the MB resource from
> firmware, but hard-code the data_width to 4. If the maximum bandwidth
> requires more digits - the tabular format is silently broken.
> This is also broken when the mba_MBps mount option is used as the
> field width isn't updated. If new schema are added resctrl will need
> to be able to determine the maximum width. The benefit of this
> pretty-printing is questionable.
>
> Instead of handling runtime discovery of the data_width for AMD platforms,
> remove the feature. These fields are always zero padded so should be
> harmless to remove if the whole field has been treated as a number.
> In the above example, this would now look like this:
> | SMBA:0=2048
> | L3:0=ff
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
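
The commit message's claim that dropping the zero padding is harmless "if the
whole field has been treated as a number" can be illustrated with a small
userspace sketch. parse_cbm_field() and same_cbm() are hypothetical helper
names, not resctrl interfaces, and the sketch assumes the consumer parses the
bitmap value as base 16, as the L3 example above suggests:

```c
#include <stdlib.h>

/*
 * Hypothetical userspace helper (not part of resctrl): parse the value part
 * of a bitmap schemata entry, such as "00ff" or "ff", as a base-16 number.
 */
static unsigned long parse_cbm_field(const char *field)
{
	return strtoul(field, NULL, 16);
}

/* Returns 1 when the padded and unpadded spellings denote the same mask. */
static int same_cbm(const char *padded, const char *unpadded)
{
	return parse_cbm_field(padded) == parse_cbm_field(unpadded);
}
```

A tool that instead compares the field as a fixed-width string would notice the
change, which is the case the commit message's "if" leaves open.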
* [PATCH v6 07/42] x86/resctrl: Add max_bw to struct resctrl_membw
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (5 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 06/42] x86/resctrl: Remove data_width and the tabular format James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 22:14 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
` (36 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
__rdt_get_mem_config_amd() and __get_mem_config_intel() both use
the default_ctrl property as a maximum value. This is because the
MBA schema works differently between these platforms. Doing this
complicates determining whether the default_ctrl property belongs
to the arch code, or can be derived from the schema format.
Deriving the maximum or default value from the schema format would
avoid the architecture code having to tell resctrl such obvious
things as the maximum percentage is 100, and the maximum bitmap
is all ones.
Maximum bandwidth is always going to vary per platform. Add
max_bw as a special case. This is currently used for the maximum
MBA percentage on Intel platforms, but can be removed from the
architecture code if 'percentage' becomes a schema format resctrl
supports directly.
This value isn't needed for other schema formats.
This will allow the default_ctrl to be generated from the schema
properties when it is needed.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Removed redundant setting of schema_fmt on AMD platforms.
* Fixed off by one in cbm_validate().
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 2 ++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 ++--
include/linux/resctrl.h | 2 ++
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 754fb65565ec..4504a12efc97 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -212,6 +212,7 @@ static __init bool __get_mem_config_intel(struct rdt_resource *r)
hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
+ r->membw.max_bw = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
if (ecx & MBA_IS_LINEAR) {
r->membw.delay_linear = true;
@@ -250,6 +251,7 @@ static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)
cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
hw_res->num_closid = edx + 1;
r->default_ctrl = 1 << eax;
+ r->membw.max_bw = 1 << eax;
/* AMD does not use delay */
r->membw.delay_linear = false;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 59610b209b4e..23a01eaebd58 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -63,9 +63,9 @@ static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
return true;
}
- if (bw < r->membw.min_bw || bw > r->default_ctrl) {
+ if (bw < r->membw.min_bw || bw > r->membw.max_bw) {
rdt_last_cmd_printf("MB value %u out of range [%d,%d]\n",
- bw, r->membw.min_bw, r->default_ctrl);
+ bw, r->membw.min_bw, r->membw.max_bw);
return false;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 41eee6377a0f..cfe451ae6ded 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -165,6 +165,7 @@ enum membw_throttle_mode {
/**
* struct resctrl_membw - Memory bandwidth allocation related data
* @min_bw: Minimum memory bandwidth percentage user can request
+ * @max_bw: Maximum memory bandwidth value, used as the reset value
* @bw_gran: Granularity at which the memory bandwidth is allocated
* @delay_linear: True if memory B/W delay is in linear scale
* @arch_needs_linear: True if we can't configure non-linear resources
@@ -175,6 +176,7 @@ enum membw_throttle_mode {
*/
struct resctrl_membw {
u32 min_bw;
+ u32 max_bw;
u32 bw_gran;
u32 delay_linear;
bool arch_needs_linear;
--
2.39.2
* Re: [PATCH v6 07/42] x86/resctrl: Add max_bw to struct resctrl_membw
2025-02-07 18:17 ` [PATCH v6 07/42] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
@ 2025-02-19 22:14 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 22:14 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> __rdt_get_mem_config_amd() and __get_mem_config_intel() both use
> the default_ctrl property as a maximum value. This is because the
> MBA schema works differently between these platforms. Doing this
> complicates determining whether the default_ctrl property belongs
> to the arch code, or can be derived from the schema format.
>
> Deriving the maximum or default value from the schema format would
> avoid the architecture code having to tell resctrl such obvious
> things as the maximum percentage is 100, and the maximum bitmap
> is all ones.
>
> Maximum bandwidth is always going to vary per platform. Add
> max_bw as a special case. This is currently used for the maximum
> MBA percentage on Intel platforms, but can be removed from the
> architecture code if 'percentage' becomes a schema format resctrl
> supports directly.
>
> This value isn't needed for other schema formats.
>
> This will allow the default_ctrl to be generated from the schema
> properties when it is needed.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
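
The range check that this patch moves from default_ctrl to max_bw can be
sketched in userspace as follows. struct membw_sketch and bw_in_range() are
hypothetical stand-ins for struct resctrl_membw and the comparison inside
bw_validate(); only the two bounds are modelled, and any concrete min_bw or
max_bw values a caller passes in are illustrative assumptions rather than
values taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct resctrl_membw; only the bounds are kept. */
struct membw_sketch {
	uint32_t min_bw;	/* smallest value a user may request */
	uint32_t max_bw;	/* largest value a user may request */
};

/* Mirrors the patched test: reject bw < min_bw and bw > max_bw. */
static bool bw_in_range(uint32_t bw, const struct membw_sketch *m)
{
	return bw >= m->min_bw && bw <= m->max_bw;
}
```

With max_bw set to a percentage ceiling of 100 the check accepts 100 and
rejects 101, while a platform that derives max_bw as 1 << 11 from firmware
would accept 2048, the value shown for SMBA earlier in the thread.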
* [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (6 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 07/42] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 22:54 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 09/42] x86/resctrl: Add helper for setting CPU default properties James Morse
` (35 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The struct rdt_resource default_ctrl is used by both the architecture
code for resetting the hardware controls, and sometimes by the
filesystem code as the default value for the schema, unless the
bandwidth software controller is in use.
Having the default exposed by the architecture code causes unnecessary
duplication for each architecture as the default value must be specified,
but can be derived from other schema properties. Now that the
maximum bandwidth is explicitly described, resctrl can derive the default
value from the schema format and the other resource properties.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Rewrote commit message.
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 16 +++++++---------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 5 +++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 +++---
include/linux/resctrl.h | 19 +++++++++++++++++--
4 files changed, 30 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4504a12efc97..6fd195b600b1 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -143,7 +143,10 @@ static inline void cache_alloc_hsw_probe(void)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
struct rdt_resource *r = &hw_res->r_resctrl;
- u64 max_cbm = BIT_ULL_MASK(20) - 1, l3_cbm_0;
+ u64 max_cbm, l3_cbm_0;
+
+ r->cache.cbm_len = 20;
+ max_cbm = resctrl_get_default_ctrl(r);
if (wrmsrl_safe(MSR_IA32_L3_CBM_BASE, max_cbm))
return;
@@ -155,8 +158,6 @@ static inline void cache_alloc_hsw_probe(void)
return;
hw_res->num_closid = 4;
- r->default_ctrl = max_cbm;
- r->cache.cbm_len = 20;
r->cache.shareable_bits = 0xc0000;
r->cache.min_cbm_bits = 2;
r->cache.arch_has_sparse_bitmasks = false;
@@ -211,7 +212,6 @@ static __init bool __get_mem_config_intel(struct rdt_resource *r)
cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
- r->default_ctrl = MAX_MBA_BW;
r->membw.max_bw = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
if (ecx & MBA_IS_LINEAR) {
@@ -250,7 +250,6 @@ static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)
cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
hw_res->num_closid = edx + 1;
- r->default_ctrl = 1 << eax;
r->membw.max_bw = 1 << eax;
/* AMD does not use delay */
@@ -281,8 +280,7 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
hw_res->num_closid = edx.split.cos_max + 1;
r->cache.cbm_len = eax.split.cbm_len + 1;
- r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
- r->cache.shareable_bits = ebx & r->default_ctrl;
+ r->cache.shareable_bits = ebx & resctrl_get_default_ctrl(r);
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
r->alloc_capable = true;
@@ -329,7 +327,7 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
return MAX_MBA_BW - bw;
pr_warn_once("Non Linear delay-bw map not supported but queried\n");
- return r->default_ctrl;
+ return resctrl_get_default_ctrl(r);
}
static void mba_wrmsr_intel(struct msr_param *m)
@@ -438,7 +436,7 @@ static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
* For Memory Allocation: Set b/w requested to 100%
*/
for (i = 0; i < hw_res->num_closid; i++, dc++)
- *dc = r->default_ctrl;
+ *dc = resctrl_get_default_ctrl(r);
}
static void ctrl_domain_free(struct rdt_hw_ctrl_domain *hw_dom)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 23a01eaebd58..5d87f279085f 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -113,8 +113,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
*/
static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
{
- unsigned long first_bit, zero_bit, val;
+ u32 supported_bits = BIT_MASK(r->cache.cbm_len) - 1;
unsigned int cbm_len = r->cache.cbm_len;
+ unsigned long first_bit, zero_bit, val;
int ret;
ret = kstrtoul(buf, 16, &val);
@@ -123,7 +124,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
return false;
}
- if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
+ if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
rdt_last_cmd_puts("Mask out of range\n");
return false;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1e0bae1a9d95..cd8f65c12124 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -978,7 +978,7 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
struct resctrl_schema *s = of->kn->parent->priv;
struct rdt_resource *r = s->res;
- seq_printf(seq, "%x\n", r->default_ctrl);
+ seq_printf(seq, "%x\n", resctrl_get_default_ctrl(r));
return 0;
}
@@ -2882,7 +2882,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
hw_dom = resctrl_to_arch_ctrl_dom(d);
for (i = 0; i < hw_res->num_closid; i++)
- hw_dom->ctrl_val[i] = r->default_ctrl;
+ hw_dom->ctrl_val[i] = resctrl_get_default_ctrl(r);
msr_param.dom = d;
smp_call_function_any(&d->hdr.cpu_mask, rdt_ctrl_update, &msr_param, 1);
}
@@ -3417,7 +3417,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
}
cfg = &d->staged_config[CDP_NONE];
- cfg->new_ctrl = r->default_ctrl;
+ cfg->new_ctrl = resctrl_get_default_ctrl(r);
cfg->have_new_ctrl = true;
}
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index cfe451ae6ded..a939c0cec7fe 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -216,7 +216,6 @@ enum resctrl_schema_fmt {
* @ctrl_domains: RCU list of all control domains for this resource
* @mon_domains: RCU list of all monitor domains for this resource
* @name: Name to use in "schemata" file.
- * @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
@@ -233,7 +232,6 @@ struct rdt_resource {
struct list_head ctrl_domains;
struct list_head mon_domains;
char *name;
- u32 default_ctrl;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
@@ -268,6 +266,23 @@ struct resctrl_schema {
u32 num_closid;
};
+/**
+ * resctrl_get_default_ctrl() - Return the default control value for this
+ * resource.
+ * @r: The resource whose default control type is queried.
+ */
+static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
+{
+ switch (r->schema_fmt) {
+ case RESCTRL_SCHEMA_BITMAP:
+ return BIT_MASK(r->cache.cbm_len) - 1;
+ case RESCTRL_SCHEMA_RANGE:
+ return r->membw.max_bw;
+ }
+
+ return WARN_ON_ONCE(1);
+}
+
/* The number of closid supported by this resource regardless of CDP */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
u32 resctrl_arch_system_num_rmid_idx(void);
--
2.39.2
* Re: [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it
2025-02-07 18:17 ` [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
@ 2025-02-19 22:54 ` Reinette Chatre
2025-02-28 19:55 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 22:54 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> The struct rdt_resource default_ctrl is used by both the architecture
> code for resetting the hardware controls, and sometimes by the
> filesystem code as the default value for the schema, unless the
> bandwidth software controller is in use.
>
> Having the default exposed by the architecture code causes unnecessary
> duplication for each architecture as the default value must be specified,
> but can be derived from other schema properties. Now that the
> maximum bandwidth is explicitly described, resctrl can derive the default
> value from the schema format and the other resource properties.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v5:
> * Rewrote commit message.
>
> Changes since v2:
> * This patch is new.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 16 +++++++---------
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 5 +++--
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 +++---
> include/linux/resctrl.h | 19 +++++++++++++++++--
> 4 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 4504a12efc97..6fd195b600b1 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -143,7 +143,10 @@ static inline void cache_alloc_hsw_probe(void)
> {
> struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
> struct rdt_resource *r = &hw_res->r_resctrl;
> - u64 max_cbm = BIT_ULL_MASK(20) - 1, l3_cbm_0;
> + u64 max_cbm, l3_cbm_0;
> +
> + r->cache.cbm_len = 20;
> + max_cbm = resctrl_get_default_ctrl(r);
>
> if (wrmsrl_safe(MSR_IA32_L3_CBM_BASE, max_cbm))
> return;
> @@ -155,8 +158,6 @@ static inline void cache_alloc_hsw_probe(void)
> return;
>
> hw_res->num_closid = 4;
> - r->default_ctrl = max_cbm;
> - r->cache.cbm_len = 20;
> r->cache.shareable_bits = 0xc0000;
> r->cache.min_cbm_bits = 2;
> r->cache.arch_has_sparse_bitmasks = false;
> @@ -211,7 +212,6 @@ static __init bool __get_mem_config_intel(struct rdt_resource *r)
> cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
> hw_res->num_closid = edx.split.cos_max + 1;
> max_delay = eax.split.max_delay + 1;
> - r->default_ctrl = MAX_MBA_BW;
> r->membw.max_bw = MAX_MBA_BW;
> r->membw.arch_needs_linear = true;
> if (ecx & MBA_IS_LINEAR) {
> @@ -250,7 +250,6 @@ static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)
>
> cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
> hw_res->num_closid = edx + 1;
> - r->default_ctrl = 1 << eax;
> r->membw.max_bw = 1 << eax;
>
> /* AMD does not use delay */
> @@ -281,8 +280,7 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
> cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
> hw_res->num_closid = edx.split.cos_max + 1;
> r->cache.cbm_len = eax.split.cbm_len + 1;
> - r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
> - r->cache.shareable_bits = ebx & r->default_ctrl;
> + r->cache.shareable_bits = ebx & resctrl_get_default_ctrl(r);
> if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
> r->alloc_capable = true;
Using resctrl_get_default_ctrl() in the architecture code like this seems awkward in
the way that the caller depends on resctrl_get_default_ctrl() returning a bitmask, thus
requiring caller to be familiar with internals of function called.
> @@ -329,7 +327,7 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
> return MAX_MBA_BW - bw;
>
> pr_warn_once("Non Linear delay-bw map not supported but queried\n");
> - return r->default_ctrl;
> + return resctrl_get_default_ctrl(r);
I wonder if returning MAX_MBA_BW directly would not be more appropriate here ...
or returning r->membw.max_bw and doing so in previous patch.
> }
>
> static void mba_wrmsr_intel(struct msr_param *m)
> @@ -438,7 +436,7 @@ static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
> * For Memory Allocation: Set b/w requested to 100%
> */
> for (i = 0; i < hw_res->num_closid; i++, dc++)
> - *dc = r->default_ctrl;
> + *dc = resctrl_get_default_ctrl(r);
> }
>
> static void ctrl_domain_free(struct rdt_hw_ctrl_domain *hw_dom)
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 23a01eaebd58..5d87f279085f 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -113,8 +113,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
> */
> static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> {
> - unsigned long first_bit, zero_bit, val;
> + u32 supported_bits = BIT_MASK(r->cache.cbm_len) - 1;
What are the criteria for a caller to decide between using resctrl_get_default_ctrl() and
computing the bitmask itself? Most callers already seem to use
resctrl_get_default_ctrl() with a clear expectation of whether or not it will return
a bitmask, so it is not obvious why some callers needing a bitmask
use resctrl_get_default_ctrl() while this caller computes the bitmask itself.
> unsigned int cbm_len = r->cache.cbm_len;
> + unsigned long first_bit, zero_bit, val;
> int ret;
>
> ret = kstrtoul(buf, 16, &val);
> @@ -123,7 +124,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> return false;
> }
>
> - if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
> + if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
> rdt_last_cmd_puts("Mask out of range\n");
> return false;
> }
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 1e0bae1a9d95..cd8f65c12124 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -978,7 +978,7 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
> struct resctrl_schema *s = of->kn->parent->priv;
> struct rdt_resource *r = s->res;
>
> - seq_printf(seq, "%x\n", r->default_ctrl);
> + seq_printf(seq, "%x\n", resctrl_get_default_ctrl(r));
> return 0;
> }
While the function is "rdt_default_ctrl_show()" the file is "cbm_mask"
and so here also resctrl_get_default_ctrl() is implicitly assumed to
return only a bitmask.
>
> @@ -2882,7 +2882,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
> hw_dom = resctrl_to_arch_ctrl_dom(d);
>
> for (i = 0; i < hw_res->num_closid; i++)
> - hw_dom->ctrl_val[i] = r->default_ctrl;
> + hw_dom->ctrl_val[i] = resctrl_get_default_ctrl(r);
> msr_param.dom = d;
> smp_call_function_any(&d->hdr.cpu_mask, rdt_ctrl_update, &msr_param, 1);
> }
> @@ -3417,7 +3417,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
> }
>
> cfg = &d->staged_config[CDP_NONE];
> - cfg->new_ctrl = r->default_ctrl;
> + cfg->new_ctrl = resctrl_get_default_ctrl(r);
> cfg->have_new_ctrl = true;
> }
> }
Using resctrl_get_default_ctrl() only seems appropriate when setting or staging
the register values where the value returned is not further manipulated with
assumptions regarding its format.
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index cfe451ae6ded..a939c0cec7fe 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -216,7 +216,6 @@ enum resctrl_schema_fmt {
> * @ctrl_domains: RCU list of all control domains for this resource
> * @mon_domains: RCU list of all monitor domains for this resource
> * @name: Name to use in "schemata" file.
> - * @default_ctrl: Specifies default cache cbm or memory B/W percent.
> * @schema_fmt: Which format string and parser is used for this schema.
> * @evt_list: List of monitoring events
> * @cdp_capable: Is the CDP feature available on this resource
> @@ -233,7 +232,6 @@ struct rdt_resource {
> struct list_head ctrl_domains;
> struct list_head mon_domains;
> char *name;
> - u32 default_ctrl;
> enum resctrl_schema_fmt schema_fmt;
> struct list_head evt_list;
> bool cdp_capable;
> @@ -268,6 +266,23 @@ struct resctrl_schema {
> u32 num_closid;
> };
>
> +/**
> + * resctrl_get_default_ctrl() - Return the default control value for this
> + * resource.
> + * @r: The resource whose default control type is queried.
> + */
> +static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
> +{
> + switch (r->schema_fmt) {
> + case RESCTRL_SCHEMA_BITMAP:
> + return BIT_MASK(r->cache.cbm_len) - 1;
> + case RESCTRL_SCHEMA_RANGE:
> + return r->membw.max_bw;
> + }
> +
> + return WARN_ON_ONCE(1);
> +}
> +
> /* The number of closid supported by this resource regardless of CDP */
> u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
> u32 resctrl_arch_system_num_rmid_idx(void);
Reinette
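
For reference, the range portion of the cbm_validate() change discussed in
this review can be sketched in userspace as below. cbm_in_range() is a
hypothetical name; it models only the "Mask out of range" test from the hunk
(the real function goes on to perform further checks that are not shown in
the quoted diff), and the guard for cbm_len >= 32 is an addition of this
sketch to keep the shift well defined:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the range test only: a mask is rejected when it is zero while
 * at least one bit is required, or when it has bits above cbm_len.
 */
static bool cbm_in_range(unsigned long val, unsigned int cbm_len,
			 unsigned int min_cbm_bits)
{
	/* All cbm_len low bits set; guard added here for a 32-bit-wide mask. */
	uint32_t supported_bits = cbm_len >= 32 ?
		UINT32_MAX : (UINT32_C(1) << cbm_len) - 1;

	if (min_cbm_bits > 0 && val == 0)
		return false;

	return val <= supported_bits;
}
```

With cbm_len = 20 and min_cbm_bits = 2 (the values set in
cache_alloc_hsw_probe()), 0xfffff passes this test and 0x100000 does not.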
* Re: [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it
2025-02-19 22:54 ` Reinette Chatre
@ 2025-02-28 19:55 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:55 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 19/02/2025 22:54, Reinette Chatre wrote:
> On 2/7/25 10:17 AM, James Morse wrote:
>> The struct rdt_resource default_ctrl is used by both the architecture
>> code for resetting the hardware controls, and sometimes by the
>> filesystem code as the default value for the schema, unless the
>> bandwidth software controller is in use.
>>
>> Having the default exposed by the architecture code causes unnecessary
>> duplication for each architecture as the default value must be specified,
>> but can be derived from other schema properties. Now that the
>> maximum bandwidth is explicitly described, resctrl can derive the default
>> value from the schema format and the other resource properties.
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 4504a12efc97..6fd195b600b1 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -281,8 +280,7 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>> cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>> hw_res->num_closid = edx.split.cos_max + 1;
>> r->cache.cbm_len = eax.split.cbm_len + 1;
>> - r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>> - r->cache.shareable_bits = ebx & r->default_ctrl;
>> + r->cache.shareable_bits = ebx & resctrl_get_default_ctrl(r);
>> if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>> r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
>> r->alloc_capable = true;
> Using resctrl_get_default_ctrl() in the architecture code like this seems awkward in
> the way that the caller depends on resctrl_get_default_ctrl() returning a bitmask, thus
> requiring caller to be familiar with internals of function called.
resctrl and the arch code that provides the interface are closely coupled, so I don't
think it's a problem to have to know what the call does...
Using the helper here was just because the memory for the default value has gone away.
I agree that as this is being used to manipulate values from cpuid, it should probably be
open coded here. I'll drop this hunk (adding default_ctrl as a local variable here).
>> @@ -329,7 +327,7 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
>> return MAX_MBA_BW - bw;
>>
>> pr_warn_once("Non Linear delay-bw map not supported but queried\n");
>> - return r->default_ctrl;
>> + return resctrl_get_default_ctrl(r);
> I wonder if returning MAX_MBA_BW directly would not be more appropriate here ...
> or returning r->membw.max_bw and doing so in previous patch.
My thinking was this value is at least in the correct format for the hardware if an AMD
platform manages to get in here - but it's only called from the Intel helpers, so that
can't happen.
Using MAX_MBA_BW makes it clearer what the 'Non Linear' warning is about.
>> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> index 23a01eaebd58..5d87f279085f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> @@ -113,8 +113,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
>> */
>> static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> {
>> - unsigned long first_bit, zero_bit, val;
>> + u32 supported_bits = BIT_MASK(r->cache.cbm_len) - 1;
> What is criteria for caller to decide between using resctrl_get_default_ctrl() or
> computing the bitmask self? Most callers already seem to be using
> resctrl_get_default_ctrl() with clear expectation that it will return
> a bitmask or not so it is not obvious why some callers needing bitmask
> use resctrl_get_default_ctrl() while this caller compute bitmask self.
This one case is odd because it also needs to know the number of bits in the bitmap. I
felt computing the bitmap directly here made it clearer this was checking against a single
set of properties. (If you disagree - let's change it!)
resctrl_get_default_ctrl() is largely intended to fit callers like reset_all_ctrls() which
don't know or care what the format is, as long as it matches what the hardware expects.
Most other cases are intending to parse or format the value, so they already know what
kind of thing it's going to be.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 1e0bae1a9d95..cd8f65c12124 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -978,7 +978,7 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
>> struct resctrl_schema *s = of->kn->parent->priv;
>> struct rdt_resource *r = s->res;
>>
>> - seq_printf(seq, "%x\n", r->default_ctrl);
>> + seq_printf(seq, "%x\n", resctrl_get_default_ctrl(r));
>> return 0;
>> }
> While the function is "rdt_default_ctrl_show()" the file is "cbm_mask"
> and so here also resctrl_get_default_ctrl() is implicitly assumed to
> return only a bitmask.
That is because the file is hidden behind RFTYPE_RES_CACHE, which can only be set on the L2 or L3
if they have bitmask controls. If we ever support other kinds of controls, we'd need to do
something about this.
I agree the existing name is misleading, but I don't think it's worth the extra patch to
rename it. (Curious that it isn't called rdt_cbm_mask_show())
For the distant future, I have patches that propose adding a file to expose the schema
type to resctrl, then a bunch of files prefixed with "bitmap_" or "percent_" that describe
the control properties. What we have today is a mix of control type and resource all in
one - what does the "cbm_mask" mean if the resource is not a cache?
(it goes without saying that the existing files must stay with their same values)
>> @@ -3417,7 +3417,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
>> }
>>
>> cfg = &d->staged_config[CDP_NONE];
>> - cfg->new_ctrl = r->default_ctrl;
>> + cfg->new_ctrl = resctrl_get_default_ctrl(r);
>> cfg->have_new_ctrl = true;
>> }
>> }
> Using resctrl_get_default_ctrl() only seems appropriate when setting or staging
> the register values where the value returned is not further manipulated with
> assumptions regarding its format.
Provided the schema format has already been checked, I agree. This is what led to
cbm_validate() generating the bitmap values itself. (I take your point about the cpuid
interactions above)
In rdtgroup_init_mba() we could reach in to retrieve r->membw.max_bw, but the value isn't
being parsed or formatted, so I don't think it matters. Once we have a helper, we may as
well use it everywhere we can.
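For illustration, the idea of generating the default from the resource's
properties can be sketched in userspace C. This is a rough sketch, not the
kernel's definition: struct demo_resource, enum demo_schema_fmt and
demo_get_default_ctrl() are hypothetical stand-ins, and it assumes a bitmap
control defaults to all bits set while a range control defaults to its maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the schema format and resource description. */
enum demo_schema_fmt {
	DEMO_SCHEMA_BITMAP,	/* e.g. a cache capacity bitmask */
	DEMO_SCHEMA_RANGE,	/* e.g. a memory bandwidth percentage */
};

struct demo_resource {
	enum demo_schema_fmt fmt;
	unsigned int cbm_len;	/* bitmap width, used for DEMO_SCHEMA_BITMAP */
	uint32_t max_bw;	/* upper bound, used for DEMO_SCHEMA_RANGE */
};

/*
 * Generate the default control value from the resource's properties,
 * rather than reading a stored copy.
 */
static uint32_t demo_get_default_ctrl(const struct demo_resource *r)
{
	switch (r->fmt) {
	case DEMO_SCHEMA_BITMAP:
		assert(r->cbm_len >= 1 && r->cbm_len <= 32);
		/* All cbm_len bits set; the 64-bit shift is safe at 32. */
		return (uint32_t)((UINT64_C(1) << r->cbm_len) - 1);
	case DEMO_SCHEMA_RANGE:
		return r->max_bw;
	}
	return 0;	/* unknown format: nothing sensible to return */
}
```

A caller that only stages the value, as rdtgroup_init_mba() does, never needs
to know which branch produced it.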
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 09/42] x86/resctrl: Add helper for setting CPU default properties
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (7 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 08/42] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:09 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
` (34 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
rdtgroup_rmdir_ctrl() and rdtgroup_rmdir_mon() set the per-CPU
pqr_state for CPUs that were part of the rmdir()'d group.
Another architecture might not have a 'pqr_state'; its hardware may
need the values in a different format. MPAM's equivalent of RMID values
are not unique, and always need the CLOSID to be provided too.
There is only one caller that modifies a single value
(rdtgroup_rmdir_mon()). MPAM always needs both CLOSID and RMID
for the hardware value as these are written to the same system
register.
As rdtgroup_rmdir_mon() has the CLOSID on hand, only provide a
helper to set both values. These values are read by
__resctrl_sched_in(), but may be written by a different CPU without
any locking, so add READ/WRITE_ONCE() to avoid torn values.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v1:
* In rdtgroup_rmdir_mon(), (re)set CPU default closid based on the
parent control group, to avoid the appearance of referencing
something that we're in the process of destroying (even if it
doesn't make a difference because the victim mon group necessarily
has the same closid as the parent control group).
Update comment to match.
No (intentional) functional change.
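
As an illustration of the READ_ONCE()/WRITE_ONCE() pairing described in the
commit message, here is a rough userspace sketch. The DEMO_* macros are
simplified stand-ins for the kernel macros (they rely on the GCC/Clang
__typeof__ extension), and demo_pqr[] is a hypothetical array standing in
for the per-CPU pqr_state. It is not the code in this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-ins for READ_ONCE()/WRITE_ONCE(): the volatile access
 * asks the compiler for exactly one load or store of the object, so the
 * access is not split, repeated or merged with its neighbours.
 */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

#define DEMO_NR_CPUS 4

struct demo_pqr_state {
	uint32_t default_closid;
	uint32_t default_rmid;
};

/* Hypothetical per-CPU state, modelled as an array indexed by CPU. */
static struct demo_pqr_state demo_pqr[DEMO_NR_CPUS];

/* Writer: may run on a different CPU from the one being updated. */
static void demo_set_cpu_default_closid_rmid(int cpu, uint32_t closid,
					     uint32_t rmid)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	DEMO_WRITE_ONCE(demo_pqr[cpu].default_closid, closid);
	DEMO_WRITE_ONCE(demo_pqr[cpu].default_rmid, rmid);
}

/* Reader: the sched-in path snapshots each field with a single load. */
static void demo_sched_in(int cpu, uint32_t *closid, uint32_t *rmid)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	*closid = DEMO_READ_ONCE(demo_pqr[cpu].default_closid);
	*rmid = DEMO_READ_ONCE(demo_pqr[cpu].default_rmid);
}
```

Note that this only keeps each 32-bit field from being torn; the two fields
are still updated one after the other, not as a single atomic pair.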
---
arch/x86/include/asm/resctrl.h | 14 +++++++++++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 20 ++++++++++++++------
2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 8b1b6ce1e51b..6908cd0e6e40 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -4,8 +4,9 @@
#ifdef CONFIG_X86_CPU_RESCTRL
-#include <linux/sched.h>
#include <linux/jump_label.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
/*
* This value can never be a valid CLOSID, and is used when mapping a
@@ -96,8 +97,8 @@ static inline void resctrl_arch_disable_mon(void)
static inline void __resctrl_sched_in(struct task_struct *tsk)
{
struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
- u32 closid = state->default_closid;
- u32 rmid = state->default_rmid;
+ u32 closid = READ_ONCE(state->default_closid);
+ u32 rmid = READ_ONCE(state->default_rmid);
u32 tmp;
/*
@@ -132,6 +133,13 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
return val * scale;
}
+static inline void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid,
+ u32 rmid)
+{
+ WRITE_ONCE(per_cpu(pqr_state.default_closid, cpu), closid);
+ WRITE_ONCE(per_cpu(pqr_state.default_rmid, cpu), rmid);
+}
+
static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
u32 closid, u32 rmid)
{
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cd8f65c12124..f706e5a288b1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3731,14 +3731,21 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
{
struct rdtgroup *prdtgrp = rdtgrp->mon.parent;
+ u32 closid, rmid;
int cpu;
/* Give any tasks back to the parent group */
rdt_move_group_tasks(rdtgrp, prdtgrp, tmpmask);
- /* Update per cpu rmid of the moved CPUs first */
+ /*
+ * Update per cpu closid/rmid of the moved CPUs first.
+ * Note: the closid will not change, but the arch code still needs it.
+ */
+ closid = prdtgrp->closid;
+ rmid = prdtgrp->mon.rmid;
for_each_cpu(cpu, &rdtgrp->cpu_mask)
- per_cpu(pqr_state.default_rmid, cpu) = prdtgrp->mon.rmid;
+ resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
+
/*
* Update the MSR on moved CPUs and CPUs which have moved
* task running on them.
@@ -3771,6 +3778,7 @@ static int rdtgroup_ctrl_remove(struct rdtgroup *rdtgrp)
static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
{
+ u32 closid, rmid;
int cpu;
/* Give any tasks back to the default group */
@@ -3781,10 +3789,10 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
&rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
/* Update per cpu closid and rmid of the moved CPUs first */
- for_each_cpu(cpu, &rdtgrp->cpu_mask) {
- per_cpu(pqr_state.default_closid, cpu) = rdtgroup_default.closid;
- per_cpu(pqr_state.default_rmid, cpu) = rdtgroup_default.mon.rmid;
- }
+ closid = rdtgroup_default.closid;
+ rmid = rdtgroup_default.mon.rmid;
+ for_each_cpu(cpu, &rdtgrp->cpu_mask)
+ resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
/*
* Update the MSR on moved CPUs and CPUs which have moved
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 09/42] x86/resctrl: Add helper for setting CPU default properties
2025-02-07 18:17 ` [PATCH v6 09/42] x86/resctrl: Add helper for setting CPU default properties James Morse
@ 2025-02-19 23:09 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:09 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> rdtgroup_rmdir_ctrl() and rdtgroup_rmdir_mon() set the per-CPU
> pqr_state for CPUs that were part of the rmdir()'d group.
>
> Another architecture might not have a 'pqr_state', its hardware may
> need the values in a different format. MPAM's equivalent of RMID values
> are not unique, and always need the CLOSID to be provided too.
>
> There is only one caller that modifies a single value,
> (rdtgroup_rmdir_mon()). MPAM always needs both CLOSID and RMID
> for the hardware value as these are written to the same system
> register.
>
> As rdtgroup_rmdir_mon() has the CLOSID on hand, only provide a
> helper to set both values. These values are read by
> __resctrl_sched_in(), but may be written by a different CPU without
> any locking, add READ/WRTE_ONCE() to avoid torn values.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (8 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 09/42] x86/resctrl: Add helper for setting CPU default properties James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:13 ` Reinette Chatre
2025-02-27 20:25 ` Moger, Babu
2025-02-07 18:17 ` [PATCH v6 11/42] x86/resctrl: Expose resctrl fs's init function to the rest of the kernel James Morse
` (33 subsequent siblings)
43 siblings, 2 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
it uses to update the local CPU's default pqr values. This is a problem
once the resctrl parts move out to /fs/, as the arch code cannot
poke around inside struct rdtgroup.
Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpu_closid_rmid()
to be used as the target of an IPI, and pass the effective CLOSID
and RMID in a new struct.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v1:
* To clarify the meanings of the new helper and struct:
Rename resctrl_arch_sync_cpu_default() to
resctrl_arch_sync_cpu_closid_rmid();
Rename struct resctrl_cpu_sync to struct resctrl_cpu_defaults;
Flesh out the comment block in <linux/resctrl.h>.
No functional change.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++----
include/linux/resctrl.h | 22 ++++++++++++++++++++++
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f706e5a288b1..62d9a50c7bba 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -355,13 +355,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
* from update_closid_rmid() is protected against __switch_to() because
* preemption is disabled.
*/
-static void update_cpu_closid_rmid(void *info)
+void resctrl_arch_sync_cpu_closid_rmid(void *info)
{
- struct rdtgroup *r = info;
+ struct resctrl_cpu_defaults *r = info;
if (r) {
this_cpu_write(pqr_state.default_closid, r->closid);
- this_cpu_write(pqr_state.default_rmid, r->mon.rmid);
+ this_cpu_write(pqr_state.default_rmid, r->rmid);
}
/*
@@ -376,11 +376,20 @@ static void update_cpu_closid_rmid(void *info)
* Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
*
* Per task closids/rmids must have been set up before calling this function.
+ * @r may be NULL.
*/
static void
update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
{
- on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
+ struct resctrl_cpu_defaults defaults, *p = NULL;
+
+ if (r) {
+ defaults.closid = r->closid;
+ defaults.rmid = r->mon.rmid;
+ p = &defaults;
+ }
+
+ on_each_cpu_mask(cpu_mask, resctrl_arch_sync_cpu_closid_rmid, p, 1);
}
static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a939c0cec7fe..da3b344d06d3 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -266,6 +266,28 @@ struct resctrl_schema {
u32 num_closid;
};
+struct resctrl_cpu_defaults {
+ u32 closid;
+ u32 rmid;
+};
+
+/**
+ * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
+ * Call via IPI.
+ * @info: If non-NULL, a pointer to a struct resctrl_cpu_defaults
+ * specifying the new CLOSID and RMID for tasks in the default
+ * resctrl ctrl and mon group when running on this CPU. If NULL,
+ * this CPU is not re-assigned to a different default group.
+ *
+ * Propagates reassignment of CPUs and/or tasks to different resctrl groups
+ * when requested by the resctrl core code.
+ *
+ * This function records the per-cpu defaults specified by @info (if any),
+ * and then reconfigures the CPU's hardware CLOSID and RMID for subsequent
+ * execution based on @current, in the same way as during a task switch.
+ */
+void resctrl_arch_sync_cpu_closid_rmid(void *info);
+
/**
* resctrl_get_default_ctrl() - Return the default control value for this
* resource.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
2025-02-07 18:17 ` [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
@ 2025-02-19 23:13 ` Reinette Chatre
2025-02-27 20:25 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:13 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
> it uses to update the local CPUs default pqr values. This is a problem
> once the resctrl parts move out to /fs/, as the arch code cannot
> poke around inside struct rdtgroup.
>
> Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpus_defaults()
> to be used as the target of an IPI, and pass the effective CLOSID
> and RMID in a new struct.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
2025-02-07 18:17 ` [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
2025-02-19 23:13 ` Reinette Chatre
@ 2025-02-27 20:25 ` Moger, Babu
2025-02-28 19:54 ` James Morse
1 sibling, 1 reply; 135+ messages in thread
From: Moger, Babu @ 2025-02-27 20:25 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi James,
On 2/7/25 12:17, James Morse wrote:
> update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
> it uses to update the local CPUs default pqr values. This is a problem
> once the resctrl parts move out to /fs/, as the arch code cannot
> poke around inside struct rdtgroup.
>
> Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpus_defaults()
> to be used as the target of an IPI, and pass the effective CLOSID
> and RMID in a new struct.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v1:
> * To clarify the meanings of the new helper and struct:
>
> Rename resctrl_arch_sync_cpu_default() to
> resctrl_arch_sync_cpu_closid_rmid();
>
> Rename struct resctrl_cpu_sync to struct resctrl_cpu_defaults;
>
> Flesh out the comment block in <linux/resctrl.h>.
>
> No functional change.
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++----
> include/linux/resctrl.h | 22 ++++++++++++++++++++++
> 2 files changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index f706e5a288b1..62d9a50c7bba 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -355,13 +355,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
> * from update_closid_rmid() is protected against __switch_to() because
> * preemption is disabled.
> */
> -static void update_cpu_closid_rmid(void *info)
> +void resctrl_arch_sync_cpu_closid_rmid(void *info)
> {
> - struct rdtgroup *r = info;
> + struct resctrl_cpu_defaults *r = info;
>
> if (r) {
> this_cpu_write(pqr_state.default_closid, r->closid);
> - this_cpu_write(pqr_state.default_rmid, r->mon.rmid);
> + this_cpu_write(pqr_state.default_rmid, r->rmid);
> }
>
> /*
> @@ -376,11 +376,20 @@ static void update_cpu_closid_rmid(void *info)
> * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
> *
> * Per task closids/rmids must have been set up before calling this function.
> + * @r may be NULL.
> */
> static void
> update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
> {
> - on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
> + struct resctrl_cpu_defaults defaults, *p = NULL;
> +
> + if (r) {
> + defaults.closid = r->closid;
> + defaults.rmid = r->mon.rmid;
> + p = &defaults;
> + }
> +
> + on_each_cpu_mask(cpu_mask, resctrl_arch_sync_cpu_closid_rmid, p, 1);
> }
>
> static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index a939c0cec7fe..da3b344d06d3 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -266,6 +266,28 @@ struct resctrl_schema {
> u32 num_closid;
> };
>
> +struct resctrl_cpu_defaults {
> + u32 closid;
> + u32 rmid;
> +};
> +
Can you please explain why this is part of resctrl.h?
Isn't this part of architecture specific definition?
> +/**
> + * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
> + * Call via IPI.
> + * @info: If non-NULL, a pointer to a struct resctrl_cpu_defaults
> + * specifying the new CLOSID and RMID for tasks in the default
> + * resctrl ctrl and mon group when running on this CPU. If NULL,
> + * this CPU is not re-assigned to a different default group.
> + *
> + * Propagates reassignment of CPUs and/or tasks to different resctrl groups
> + * when requested by the resctrl core code.
> + *
> + * This function records the per-cpu defaults specified by @info (if any),
> + * and then reconfigures the CPU's hardware CLOSID and RMID for subsequent
> + * execution based on @current, in the same way as during a task switch.
> + */
> +void resctrl_arch_sync_cpu_closid_rmid(void *info);
> +
> /**
> * resctrl_get_default_ctrl() - Return the default control value for this
> * resource.
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
2025-02-27 20:25 ` Moger, Babu
@ 2025-02-28 19:54 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:54 UTC (permalink / raw)
To: babu.moger, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Babu,
On 27/02/2025 20:25, Moger, Babu wrote:
> On 2/7/25 12:17, James Morse wrote:
>> update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
>> it uses to update the local CPUs default pqr values. This is a problem
>> once the resctrl parts move out to /fs/, as the arch code cannot
>> poke around inside struct rdtgroup.
>>
>> Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpus_defaults()
>> to be used as the target of an IPI, and pass the effective CLOSID
>> and RMID in a new struct.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index a939c0cec7fe..da3b344d06d3 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -266,6 +266,28 @@ struct resctrl_schema {
>> u32 num_closid;
>> };
>>
>> +struct resctrl_cpu_defaults {
>> + u32 closid;
>> + u32 rmid;
>> +};
>> +
>
> Can you please explain why this is part of resctrl.h?
>
> Isn't this part of architecture specific definition?
update_closid_rmid() builds an on-stack copy of this, then IPIs each CPU to call
resctrl_arch_sync_cpu_closid_rmid(). Because of the IPI, resctrl would otherwise need to
invent an identical structure.
If the filesystem and arch code's uses of this diverge, it may be necessary to duplicate
them (or declare one inside another), but it's not needed today.
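To make the shape of that concrete, here is a rough userspace sketch of the
pattern. Everything prefixed demo_ is hypothetical: demo_on_each_cpu_mask()
is a plain loop standing in for the IPI, and demo_pqr_state[] stands in for
the per-CPU state. It is a sketch of the idea, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4

/* Hypothetical mirror of the struct shared by filesystem and arch code. */
struct demo_cpu_defaults {
	uint32_t closid;
	uint32_t rmid;
};

/* Per-CPU state, modelled as a plain array indexed by CPU number. */
static struct demo_cpu_defaults demo_pqr_state[DEMO_NR_CPUS];

/* Stand-in for "the CPU this code is running on" during the fake IPI. */
static int demo_this_cpu;

/* Arch side: IPI target. A NULL @info leaves this CPU's defaults alone. */
static void demo_arch_sync_cpu_closid_rmid(void *info)
{
	struct demo_cpu_defaults *r = info;

	assert(demo_this_cpu >= 0 && demo_this_cpu < DEMO_NR_CPUS);
	if (r)
		demo_pqr_state[demo_this_cpu] = *r;
	/* A real implementation would also reprogram the hardware here. */
}

/* Stand-in for the IPI: run @fn for every CPU whose bit is set in @mask. */
static void demo_on_each_cpu_mask(unsigned int mask, void (*fn)(void *),
				  void *info)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (mask & (1u << cpu)) {
			demo_this_cpu = cpu;
			fn(info);
		}
	}
}

/*
 * Filesystem side: build the on-stack copy, or pass NULL for "no change".
 * The on-stack copy is only safe because the call is synchronous.
 */
static void demo_update_closid_rmid(unsigned int mask, int have_group,
				    uint32_t closid, uint32_t rmid)
{
	struct demo_cpu_defaults defaults, *p = NULL;

	if (have_group) {
		defaults.closid = closid;
		defaults.rmid = rmid;
		p = &defaults;
	}
	demo_on_each_cpu_mask(mask, demo_arch_sync_cpu_closid_rmid, p);
}
```

Both sides have to agree on the layout behind the void pointer, which is why
a single shared definition is the simplest option for now.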
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 11/42] x86/resctrl: Expose resctrl fs's init function to the rest of the kernel
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (9 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 10/42] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:15 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code James Morse
` (32 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
rdtgroup_init() needs exposing to the rest of the kernel so that arch
code can call it once it lives in core code. As this is one of the few
functions exposed, rename it to have "resctrl" in the name. The same
goes for the exit call.
Rename x86's arch code init functions for RDT to have an arch
prefix to make it clear these are part of the architecture code.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Tweaked the word 'export'
Changes since v4:
* Changed the voice of some of the commit message.
Changes since v1:
* Rename stale rdtgroup_init() to resctrl_init() in
arch/x86/kernel/cpu/resctrl/monitor.c comments.
No functional change.
* [Commit message only] Minor rewording to avoid "impersonating code".
* [Commit message only] Typo fix:
s/to have the resctrl/to have resctrl/ in commit message.
---
arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++------
arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++----
include/linux/resctrl.h | 3 +++
5 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6fd195b600b1..18e5f13bc4ae 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -1062,7 +1062,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
}
}
-static int __init resctrl_late_init(void)
+static int __init resctrl_arch_late_init(void)
{
struct rdt_resource *r;
int state, ret;
@@ -1085,7 +1085,7 @@ static int __init resctrl_late_init(void)
if (state < 0)
return state;
- ret = rdtgroup_init();
+ ret = resctrl_init();
if (ret) {
cpuhp_remove_state(state);
return ret;
@@ -1101,18 +1101,18 @@ static int __init resctrl_late_init(void)
return 0;
}
-late_initcall(resctrl_late_init);
+late_initcall(resctrl_arch_late_init);
-static void __exit resctrl_exit(void)
+static void __exit resctrl_arch_exit(void)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
cpuhp_remove_state(rdt_online);
- rdtgroup_exit();
+ resctrl_exit();
if (r->mon_capable)
rdt_put_mon_l3_config();
}
-__exitcall(resctrl_exit);
+__exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f975cd6cfe61..8291f1b59981 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -328,9 +328,6 @@ extern struct list_head rdt_all_groups;
extern int max_name_width;
-int __init rdtgroup_init(void);
-void __exit rdtgroup_exit(void);
-
/**
* struct rftype - describe each file in the resctrl file system
* @name: File name
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 58b5b21349a8..e8388d19a579 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1027,7 +1027,7 @@ static int dom_data_init(struct rdt_resource *r)
/*
* RESCTRL_RESERVED_CLOSID and RESCTRL_RESERVED_RMID are special and
* are always allocated. These are used for the rdtgroup_default
- * control group, which will be setup later in rdtgroup_init().
+ * control group, which will be setup later in resctrl_init().
*/
idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
RESCTRL_RESERVED_RMID);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 62d9a50c7bba..b2dad689e780 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4235,14 +4235,14 @@ void resctrl_offline_cpu(unsigned int cpu)
}
/*
- * rdtgroup_init - rdtgroup initialization
+ * resctrl_init - resctrl filesystem initialization
*
* Setup resctrl file system including set up root, create mount point,
- * register rdtgroup filesystem, and initialize files under root directory.
+ * register resctrl filesystem, and initialize files under root directory.
*
* Return: 0 on success or -errno
*/
-int __init rdtgroup_init(void)
+int __init resctrl_init(void)
{
int ret = 0;
@@ -4290,7 +4290,7 @@ int __init rdtgroup_init(void)
return ret;
}
-void __exit rdtgroup_exit(void)
+void __exit resctrl_exit(void)
{
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index da3b344d06d3..b74853f224f7 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -402,4 +402,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
+int __init resctrl_init(void);
+void __exit resctrl_exit(void);
+
#endif /* _RESCTRL_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 11/42] x86/resctrl: Expose resctrl fs's init function to the rest of the kernel
2025-02-07 18:17 ` [PATCH v6 11/42] x86/resctrl: Expose resctrl fs's init function to the rest of the kernel James Morse
@ 2025-02-19 23:15 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:15 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> rdtgroup_init() needs exposing to the rest of the kernel so that arch
> code can call it once it lives in core code. As this is one of the few
> functions exposed, rename it to have "resctrl" in the name. The same
> goes for the exit call.
>
> Rename x86's arch code init functions for RDT to have an arch
> prefix to make it clear these are part of the architecture code.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (10 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 11/42] x86/resctrl: Expose resctrl fs's init function to the rest of the kernel James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:24 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header James Morse
` (31 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
rdt_find_domain() finds a domain given a resource and a cache-id.
This is used by both the architecture code and the filesystem code.
After the filesystem code moves to live in /fs/, this helper will no
longer be visible.
Move it to the global header file. As it's now globally visible, and
has only a handful of callers, swap the 'rdt' for 'resctrl'.
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v5:
* This patch replaced one that split off the 'new entry to insert'
behaviour.
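
For readers unfamiliar with the helper, the lookup walks a list sorted by id
and, on a miss, reports where a new entry would go. A rough userspace sketch
of that behaviour follows. It uses a hypothetical singly linked
struct demo_domain instead of the kernel's struct list_head, so the names
and types here are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain header on a sorted list. */
struct demo_domain {
	int id;
	struct demo_domain *next;
};

/*
 * Search a list sorted by ascending id. On a hit, return the domain.
 * On a miss, return NULL and, if @pos is non-NULL, store the link that a
 * new entry with this id should occupy to keep the list sorted.
 */
static struct demo_domain *demo_find_domain(struct demo_domain **head, int id,
					    struct demo_domain ***pos)
{
	struct demo_domain **link;

	for (link = head; *link; link = &(*link)->next) {
		/* When id is found, return its domain. */
		if ((*link)->id == id)
			return *link;
		/* Stop at the first entry with a bigger id. */
		if (id < (*link)->id)
			break;
	}

	if (pos)
		*pos = link;

	return NULL;
}

/* Insert @d at the position reported by a failed lookup. */
static void demo_insert_at(struct demo_domain **pos, struct demo_domain *d)
{
	assert(pos && d);
	d->next = *pos;
	*pos = d;
}
```

Callers that only remove CPUs pass NULL for the position, as the
domain_remove_cpu_*() callers do.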
---
arch/x86/kernel/cpu/resctrl/core.c | 38 +++--------------------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
include/linux/resctrl.h | 34 ++++++++++++++++++++
4 files changed, 39 insertions(+), 37 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 18e5f13bc4ae..49a9ac0dd96c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -395,36 +395,6 @@ void rdt_ctrl_update(void *arg)
hw_res->msr_update(m);
}
-/*
- * rdt_find_domain - Search for a domain id in a resource domain list.
- *
- * Search the domain list to find the domain id. If the domain id is
- * found, return the domain. NULL otherwise. If the domain id is not
- * found (and NULL returned) then the first domain with id bigger than
- * the input id can be returned to the caller via @pos.
- */
-struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
- struct list_head **pos)
-{
- struct rdt_domain_hdr *d;
- struct list_head *l;
-
- list_for_each(l, h) {
- d = list_entry(l, struct rdt_domain_hdr, list);
- /* When id is found, return its domain. */
- if (id == d->id)
- return d;
- /* Stop searching when finding id's position in sorted list. */
- if (id < d->id)
- break;
- }
-
- if (pos)
- *pos = l;
-
- return NULL;
-}
-
static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -535,7 +505,7 @@ static void domain_add_cpu_ctrl(int cpu, struct rdt_resource *r)
return;
}
- hdr = rdt_find_domain(&r->ctrl_domains, id, &add_pos);
+ hdr = resctrl_find_domain(&r->ctrl_domains, id, &add_pos);
if (hdr) {
if (WARN_ON_ONCE(hdr->type != RESCTRL_CTRL_DOMAIN))
return;
@@ -590,7 +560,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
return;
}
- hdr = rdt_find_domain(&r->mon_domains, id, &add_pos);
+ hdr = resctrl_find_domain(&r->mon_domains, id, &add_pos);
if (hdr) {
if (WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN))
return;
@@ -655,7 +625,7 @@ static void domain_remove_cpu_ctrl(int cpu, struct rdt_resource *r)
return;
}
- hdr = rdt_find_domain(&r->ctrl_domains, id, NULL);
+ hdr = resctrl_find_domain(&r->ctrl_domains, id, NULL);
if (!hdr) {
pr_warn("Can't find control domain for id=%d for CPU %d for resource %s\n",
id, cpu, r->name);
@@ -701,7 +671,7 @@ static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r)
return;
}
- hdr = rdt_find_domain(&r->mon_domains, id, NULL);
+ hdr = resctrl_find_domain(&r->mon_domains, id, NULL);
if (!hdr) {
pr_warn("Can't find monitor domain for id=%d for CPU %d for resource %s\n",
id, cpu, r->name);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 5d87f279085f..7df98fda8a32 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -695,7 +695,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
* This file provides data from a single domain. Search
* the resource to find the domain with "domid".
*/
- hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
+ hdr = resctrl_find_domain(&r->mon_domains, domid, NULL);
if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
ret = -ENOENT;
goto out;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 8291f1b59981..da73404183da 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -581,8 +581,6 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn);
int rdtgroup_kn_mode_restrict(struct rdtgroup *r, const char *name);
int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
umode_t mask);
-struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
- struct list_head **pos);
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
int rdtgroup_schemata_show(struct kernfs_open_file *of,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b74853f224f7..6cec088ae0d9 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
might_sleep();
}
+/**
+ * resctrl_find_domain() - Search for a domain id in a resource domain list.
+ * @h: The domain list to search.
+ * @id: The domain id to search for.
+ * @pos: A pointer to position in the list id should be inserted.
+ *
+ * Search the domain list to find the domain id. If the domain id is
+ * found, return the domain. NULL otherwise. If the domain id is not
+ * found (and NULL returned) then the first domain with id bigger than
+ * the input id can be returned to the caller via @pos.
+ */
+static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
+ int id,
+ struct list_head **pos)
+{
+ struct rdt_domain_hdr *d;
+ struct list_head *l;
+
+ list_for_each(l, h) {
+ d = list_entry(l, struct rdt_domain_hdr, list);
+ /* When id is found, return its domain. */
+ if (id == d->id)
+ return d;
+ /* Stop searching when finding id's position in sorted list. */
+ if (id < d->id)
+ break;
+ }
+
+ if (pos)
+ *pos = l;
+
+ return NULL;
+}
+
/**
* resctrl_arch_reset_rmid() - Reset any private state associated with rmid
* and eventid.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-07 18:17 ` [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code James Morse
@ 2025-02-19 23:24 ` Reinette Chatre
2025-02-20 10:58 ` Catalin Marinas
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:24 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> rdt_find_domain() finds a domain given a resource and a cache-id.
> This is used by both the architecture code and the filesystem code.
>
> After the filesystem code moves to live in /fs/, this helper will no
> longer be visible.
>
> Move it to the global header file. As its now globally visible, and
> has only a handful of callers, swap the 'rdt' for 'resctrl'.
>
> Signed-off-by: James Morse <james.morse@arm.com>
>
> ---
> Changes since v5:
> * This patch replaced one that split off the 'new entry to insert'
> behaviour.
> ---
...
> @@ -395,36 +395,6 @@ void rdt_ctrl_update(void *arg)
> hw_res->msr_update(m);
> }
>
> -/*
> - * rdt_find_domain - Search for a domain id in a resource domain list.
> - *
> - * Search the domain list to find the domain id. If the domain id is
> - * found, return the domain. NULL otherwise. If the domain id is not
> - * found (and NULL returned) then the first domain with id bigger than
> - * the input id can be returned to the caller via @pos.
> - */
> -struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
> - struct list_head **pos)
> -{
> - struct rdt_domain_hdr *d;
> - struct list_head *l;
> -
> - list_for_each(l, h) {
> - d = list_entry(l, struct rdt_domain_hdr, list);
> - /* When id is found, return its domain. */
> - if (id == d->id)
> - return d;
> - /* Stop searching when finding id's position in sorted list. */
> - if (id < d->id)
> - break;
> - }
> -
> - if (pos)
> - *pos = l;
> -
> - return NULL;
> -}
> -
> static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
> {
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
...
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
> might_sleep();
> }
>
> +/**
> + * resctrl_find_domain() - Search for a domain id in a resource domain list.
> + * @h: The domain list to search.
> + * @id: The domain id to search for.
> + * @pos: A pointer to position in the list id should be inserted.
> + *
> + * Search the domain list to find the domain id. If the domain id is
> + * found, return the domain. NULL otherwise. If the domain id is not
> + * found (and NULL returned) then the first domain with id bigger than
> + * the input id can be returned to the caller via @pos.
> + */
> +static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
> + int id,
> + struct list_head **pos)
Could you please provide a motivation for why this needs to be inline now?
> +{
> + struct rdt_domain_hdr *d;
> + struct list_head *l;
> +
> + list_for_each(l, h) {
> + d = list_entry(l, struct rdt_domain_hdr, list);
> + /* When id is found, return its domain. */
> + if (id == d->id)
> + return d;
> + /* Stop searching when finding id's position in sorted list. */
> + if (id < d->id)
> + break;
> + }
> +
> + if (pos)
> + *pos = l;
> +
> + return NULL;
> +}
> +
> /**
> * resctrl_arch_reset_rmid() - Reset any private state associated with rmid
> * and eventid.
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-19 23:24 ` Reinette Chatre
@ 2025-02-20 10:58 ` Catalin Marinas
2025-02-20 16:01 ` Reinette Chatre
0 siblings, 1 reply; 135+ messages in thread
From: Catalin Marinas @ 2025-02-20 10:58 UTC (permalink / raw)
To: Reinette Chatre
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko, Shanker Donthineni
On Wed, Feb 19, 2025 at 03:24:06PM -0800, Reinette Chatre wrote:
> On 2/7/25 10:17 AM, James Morse wrote:
> > rdt_find_domain() finds a domain given a resource and a cache-id.
> > This is used by both the architecture code and the filesystem code.
> >
> > After the filesystem code moves to live in /fs/, this helper will no
> > longer be visible.
> >
> > Move it to the global header file. As its now globally visible, and
> > has only a handful of callers, swap the 'rdt' for 'resctrl'.
[...]
> > --- a/include/linux/resctrl.h
> > +++ b/include/linux/resctrl.h
> > @@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
> > might_sleep();
> > }
> >
> > +/**
> > + * resctrl_find_domain() - Search for a domain id in a resource domain list.
> > + * @h: The domain list to search.
> > + * @id: The domain id to search for.
> > + * @pos: A pointer to position in the list id should be inserted.
> > + *
> > + * Search the domain list to find the domain id. If the domain id is
> > + * found, return the domain. NULL otherwise. If the domain id is not
> > + * found (and NULL returned) then the first domain with id bigger than
> > + * the input id can be returned to the caller via @pos.
> > + */
> > +static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
> > + int id,
> > + struct list_head **pos)
>
> Could you please provide a motivation for why this needs to be inline now?
It's in a header now, to avoid the compiler complaining about unused
static functions wherever this file is included. The alternative is a
prototype declaration and the actual implementation in a .c file.
(drive-by comment, I don't understand this subsystem well enough to make
a meaningful contribution)
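
The two options mentioned above can be sketched in a single file as follows.
This is a hedged illustration, not code from the series: the demo_* names
and the notional demo_helper.h / demo_helper.c split are hypothetical.

```c
/* ---- what would live in a shared header (hypothetical demo_helper.h) ---- */
#ifndef DEMO_HELPER_H
#define DEMO_HELPER_H

/*
 * Option 1: the body lives in the header. 'static inline' gives every
 * includer its own copy, and GCC's -Wunused-function does not warn about
 * unused static inline functions the way it does for plain static ones.
 */
static inline int demo_sum_inline(const int *v, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += v[i];

	return sum;
}

/* Option 2: only a prototype in the header; one .c file supplies the body. */
int demo_sum_extern(const int *v, int n);

#endif /* DEMO_HELPER_H */

/* ---- what would live in exactly one .c file (hypothetical demo_helper.c) ---- */
int demo_sum_extern(const int *v, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += v[i];

	return sum;
}
```

Option 1 costs a copy of the code at each call site that is actually
inlined; option 2 costs a function call and needs some object file to own
the definition.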
--
Catalin
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-20 10:58 ` Catalin Marinas
@ 2025-02-20 16:01 ` Reinette Chatre
2025-02-27 22:44 ` Fenghua Yu
2025-02-28 19:56 ` James Morse
0 siblings, 2 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 16:01 UTC (permalink / raw)
To: Catalin Marinas
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko, Shanker Donthineni
Hi Catalin,
On 2/20/25 2:58 AM, Catalin Marinas wrote:
> On Wed, Feb 19, 2025 at 03:24:06PM -0800, Reinette Chatre wrote:
>> On 2/7/25 10:17 AM, James Morse wrote:
>>> rdt_find_domain() finds a domain given a resource and a cache-id.
>>> This is used by both the architecture code and the filesystem code.
>>>
>>> After the filesystem code moves to live in /fs/, this helper will no
>>> longer be visible.
>>>
>>> Move it to the global header file. As its now globally visible, and
>>> has only a handful of callers, swap the 'rdt' for 'resctrl'.
> [...]
>>> --- a/include/linux/resctrl.h
>>> +++ b/include/linux/resctrl.h
>>> @@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
>>> might_sleep();
>>> }
>>>
>>> +/**
>>> + * resctrl_find_domain() - Search for a domain id in a resource domain list.
>>> + * @h: The domain list to search.
>>> + * @id: The domain id to search for.
>>> + * @pos: A pointer to position in the list id should be inserted.
>>> + *
>>> + * Search the domain list to find the domain id. If the domain id is
>>> + * found, return the domain. NULL otherwise. If the domain id is not
>>> + * found (and NULL returned) then the first domain with id bigger than
>>> + * the input id can be returned to the caller via @pos.
>>> + */
>>> +static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
>>> + int id,
>>> + struct list_head **pos)
>>
>> Could you please provide a motivation for why this needs to be inline now?
>
> It's in a header now, to avoid the compiler complaining about unused
> static functions wherever this file is included. The alternative is a
> prototype declaration and the actual implementation in a .c file.
resctrl_find_domain() is currently in a .c file (arch/x86/kernel/cpu/resctrl/core.c)
with a prototype declaration (in arch/x86/kernel/cpu/resctrl/internal.h). This patch
switches resctrl_find_domain() to be inline without a motivation.
After a fresh reading of "The inline disease" in Documentation/process/coding-style.rst
I do see a few red flags related to making this function inline. The function is certainly
larger than the rule of thumb of "3 lines" and considering the number of call sites I do
not see how bloating the kernel is justified.
>
> (drive-by comment, I don't really understand this subsystem to make a
> meaningful contribution)
>
Thanks for taking a look. The idea is not unique to resctrl.
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-20 16:01 ` Reinette Chatre
@ 2025-02-27 22:44 ` Fenghua Yu
2025-02-28 19:56 ` James Morse
1 sibling, 0 replies; 135+ messages in thread
From: Fenghua Yu @ 2025-02-27 22:44 UTC (permalink / raw)
To: Reinette Chatre, Catalin Marinas
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko, Shanker Donthineni
Hi, James, Reinette,
On 2/20/25 08:01, Reinette Chatre wrote:
> Hi Catalin,
>
> On 2/20/25 2:58 AM, Catalin Marinas wrote:
>> On Wed, Feb 19, 2025 at 03:24:06PM -0800, Reinette Chatre wrote:
>>> On 2/7/25 10:17 AM, James Morse wrote:
>>>> rdt_find_domain() finds a domain given a resource and a cache-id.
>>>> This is used by both the architecture code and the filesystem code.
>>>>
>>>> After the filesystem code moves to live in /fs/, this helper will no
>>>> longer be visible.
>>>>
>>>> Move it to the global header file. As its now globally visible, and
>>>> has only a handful of callers, swap the 'rdt' for 'resctrl'.
>> [...]
>>>> --- a/include/linux/resctrl.h
>>>> +++ b/include/linux/resctrl.h
>>>> @@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
>>>> might_sleep();
>>>> }
>>>>
>>>> +/**
>>>> + * resctrl_find_domain() - Search for a domain id in a resource domain list.
>>>> + * @h: The domain list to search.
>>>> + * @id: The domain id to search for.
>>>> + * @pos: A pointer to position in the list id should be inserted.
>>>> + *
>>>> + * Search the domain list to find the domain id. If the domain id is
>>>> + * found, return the domain. NULL otherwise. If the domain id is not
>>>> + * found (and NULL returned) then the first domain with id bigger than
>>>> + * the input id can be returned to the caller via @pos.
>>>> + */
>>>> +static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
>>>> + int id,
>>>> + struct list_head **pos)
>>> Could you please provide a motivation for why this needs to be inline now?
>> It's in a header now, to avoid the compiler complaining about unused
>> static functions wherever this file is included. The alternative is a
>> prototype declaration and the actual implementation in a .c file.
> resctrl_find_domain() is currently in a .c file (arch/x86/kernel/cpu/resctrl/core.c)
> with a prototype declaration (in arch/x86/kernel/cpu/resctrl/internal.h). This patch
> switches resctrl_find_domain() to be inline without a motivation.
>
> After a fresh reading of "The inline disease" in Documentation/process/coding-style.rst
> I do see a few red flags related to making this function inline. The function is certainly
> larger than the rule of thumb of "3 lines" and considering the number of call sites I do
> not see how bloating the kernel is justified.
Agree with Reinette.
Plus, resctrl_find_domain() is only called during setup and CPU hotplug,
which are not run-time paths and won't impact run-time performance.
Inlining doesn't help performance but makes the kernel bigger.
I can see the function is moved from arch/x86/kernel/cpu/resctrl/core.c
and there is no corresponding fs/resctrl/core.c.
If your motivation is to avoid adding fs/resctrl/core.c (which would be very
small), so as to have one less file and just host the function in the .h,
please consider creating fs/resctrl/core.c, putting the function in it, and
declaring it in the .h file. Then there won't be an inline issue any more.
>> (drive-by comment, I don't really understand this subsystem to make a
>> meaningful contribution)
>>
> Thanks for taking a look. The idea is not unique to resctrl.
>
> Reinette
>
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
2025-02-20 16:01 ` Reinette Chatre
2025-02-27 22:44 ` Fenghua Yu
@ 2025-02-28 19:56 ` James Morse
1 sibling, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:56 UTC (permalink / raw)
To: Reinette Chatre, Catalin Marinas
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
Hi Reinette,
On 20/02/2025 16:01, Reinette Chatre wrote:
> On 2/20/25 2:58 AM, Catalin Marinas wrote:
>> On Wed, Feb 19, 2025 at 03:24:06PM -0800, Reinette Chatre wrote:
>>> On 2/7/25 10:17 AM, James Morse wrote:
>>>> rdt_find_domain() finds a domain given a resource and a cache-id.
>>>> This is used by both the architecture code and the filesystem code.
>>>>
>>>> After the filesystem code moves to live in /fs/, this helper will no
>>>> longer be visible.
>>>>
>>>> Move it to the global header file. As its now globally visible, and
>>>> has only a handful of callers, swap the 'rdt' for 'resctrl'.
>>>> --- a/include/linux/resctrl.h
>>>> +++ b/include/linux/resctrl.h
>>>> @@ -372,6 +372,40 @@ static inline void resctrl_arch_rmid_read_context_check(void)
>>>> might_sleep();
>>>> }
>>>>
>>>> +/**
>>>> + * resctrl_find_domain() - Search for a domain id in a resource domain list.
>>>> + * @h: The domain list to search.
>>>> + * @id: The domain id to search for.
>>>> + * @pos: A pointer to position in the list id should be inserted.
>>>> + *
>>>> + * Search the domain list to find the domain id. If the domain id is
>>>> + * found, return the domain. NULL otherwise. If the domain id is not
>>>> + * found (and NULL returned) then the first domain with id bigger than
>>>> + * the input id can be returned to the caller via @pos.
>>>> + */
>>>> +static inline struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h,
>>>> + int id,
>>>> + struct list_head **pos)
>>>
>>> Could you please provide a motivation for why this needs to be inline now?
>>
>> It's in a header now, to avoid the compiler complaining about unused
>> static functions wherever this file is included. The alternative is a
>> prototype declaration and the actual implementation in a .c file.
(it was this)
> resctrl_find_domain() is currently in a .c file (arch/x86/kernel/cpu/resctrl/core.c)
> with a prototype declaration (in arch/x86/kernel/cpu/resctrl/internal.h). This patch
> switches resctrl_find_domain() to be inline without a motivation.
It's not clear which side should own this function; both the architecture and filesystem
code need to make use of it. The majority of callers are in the arch code - but putting it
here means it's duplicated between architectures.
Putting it in the filesystem code means calling out to it, and means the compiler can't
remove that extra NULL argument thing if it's unused. It's also a hindrance to having the
arch code standalone - so we can in the future consider making the filesystem parts a
module. (Tony has picked at this and pointed out that kernfs not being exported is a
bigger problem). It's not needed now (and I haven't tried), but it seems a reasonable
direction of travel.
> After a fresh reading of "The inline disease" in Documentation/process/coding-style.rst
> I do see a few red flags related to making this function inline. The function is certainly
> larger than the rule of thumb of "3 lines"
The thing about rules of thumb ...
The first example I looked at is __task_state_index(), which is likewise longer than this
rule of thumb.
> and considering the number of call sites I do not see how bloating the kernel is justified.
I think this is several orders of magnitude below the point that this is something to care
about. If this were inlined in put_user() or something lots of drivers call, I'd agree.
I'll move it to live in the filesystem code - that saves 36 bytes.
(I'm honestly surprised it's measurable!)
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (11 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 12/42] x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:29 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 14/42] x86/resctrl: Add an arch helper to reset one resource James Morse
` (30 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
When resctrl is fully factored into core and per-arch code, each arch
will need to use some resctrl common definitions in order to define its
own specializations and helpers. Following conventional practice, it
would be desirable to put the dependent arch definitions in an
<asm/resctrl.h> header that is included by the common <linux/resctrl.h>
header. However, this can make it awkward to avoid a circular
dependency between <linux/resctrl.h> and the arch header.
To avoid such dependencies, move the affected common types and
constants into a new header that does not need to depend on
<linux/resctrl.h> or on the arch headers.
The same logic applies to the monitor-configuration defines; move these
too.
Some kind of enumeration for events is needed between the filesystem
and architecture code. Take the x86 definition as it's convenient for
x86.
The definition of enum resctrl_event_id is needed to allow the
architecture code to define resctrl_arch_mon_ctx_alloc() and
resctrl_arch_mon_ctx_free().
The definition of enum resctrl_res_level is needed to allow the
architecture code to define resctrl_arch_set_cdp_enabled() and
resctrl_arch_get_cdp_enabled().
The bits for mbm_local_bytes_config et al are ABI, and must be the same
on all architectures. These are documented in
Documentation/arch/x86/resctrl.rst
The maintainers entry for these headers was missed when resctrl.h was
created. Add a wildcard entry to match both resctrl.h and
resctrl_types.h.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v3:
* Added header include.
* Corrected lists in the commit message.
Changes since v2:
* Added to the commit message why each of these things is necessary.
* Moved the enum resctrl_conf_type back to resctrl.h - this week arm's
CDP emulation code gets away without this...
Changes since v1:
* [Commit message only] Rewrite commit message to clarify the
rationale for refactoring the headers in this way.
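
As a hedged illustration of why the event-configuration bits need to be
shared, the sketch below shows how a value written to one of the
mbm_*_bytes_config files could be checked against MAX_EVT_CONFIG_BITS.
The DEMO_BIT()/DEMO_GENMASK() macros are userspace stand-ins for the
kernel's BIT()/GENMASK(), and demo_evt_config_valid() is a hypothetical
helper, not a function from this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT() and GENMASK(), 32-bit only. */
#define DEMO_BIT(n)		(1u << (n))
#define DEMO_GENMASK(h, l)	((~0u >> (31 - (h))) & (~0u << (l)))

/* Same bit assignments as the defines moved by this patch. */
#define READS_TO_LOCAL_MEM		DEMO_BIT(0)
#define READS_TO_REMOTE_MEM		DEMO_BIT(1)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	DEMO_BIT(2)
#define NON_TEMP_WRITE_TO_REMOTE_MEM	DEMO_BIT(3)
#define READS_TO_LOCAL_S_MEM		DEMO_BIT(4)
#define READS_TO_REMOTE_S_MEM		DEMO_BIT(5)
#define DIRTY_VICTIMS_TO_ALL_MEM	DEMO_BIT(6)

/* Max event bits supported */
#define MAX_EVT_CONFIG_BITS		DEMO_GENMASK(6, 0)

/* Hypothetical check: reject any value that sets a bit outside the mask. */
static bool demo_evt_config_valid(uint32_t val)
{
	return (val & ~MAX_EVT_CONFIG_BITS) == 0;
}
```

Because userspace writes these values, the bit positions cannot differ
between architectures, which is why they sit in the common types header
rather than in an arch header.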
---
MAINTAINERS | 1 +
arch/x86/include/asm/resctrl.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 24 ------------
include/linux/resctrl.h | 21 +---------
include/linux/resctrl_types.h | 54 ++++++++++++++++++++++++++
5 files changed, 57 insertions(+), 44 deletions(-)
create mode 100644 include/linux/resctrl_types.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 896a307fa065..314b9a2ebe20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19836,6 +19836,7 @@ S: Supported
F: Documentation/arch/x86/resctrl*
F: arch/x86/include/asm/resctrl.h
F: arch/x86/kernel/cpu/resctrl/
+F: include/linux/resctrl*.h
F: tools/testing/selftests/resctrl/
READ-COPY UPDATE (RCU)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 6908cd0e6e40..52f2326e2b1e 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -6,6 +6,7 @@
#include <linux/jump_label.h>
#include <linux/percpu.h>
+#include <linux/resctrl_types.h>
#include <linux/sched.h>
/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index da73404183da..5f3713fb2eaf 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -32,30 +32,6 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
-/* Reads to Local DRAM Memory */
-#define READS_TO_LOCAL_MEM BIT(0)
-
-/* Reads to Remote DRAM Memory */
-#define READS_TO_REMOTE_MEM BIT(1)
-
-/* Non-Temporal Writes to Local Memory */
-#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
-
-/* Non-Temporal Writes to Remote Memory */
-#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
-
-/* Reads to Local Memory the system identifies as "Slow Memory" */
-#define READS_TO_LOCAL_S_MEM BIT(4)
-
-/* Reads to Remote Memory the system identifies as "Slow Memory" */
-#define READS_TO_REMOTE_S_MEM BIT(5)
-
-/* Dirty Victims to All Types of Memory */
-#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
-
-/* Max event bits supported */
-#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
-
/**
* cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
* aren't marked nohz_full
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 6cec088ae0d9..74cfd48e69ee 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -6,6 +6,7 @@
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pid.h>
+#include <linux/resctrl_types.h>
/* CLOSID, RMID value used by the default control group */
#define RESCTRL_RESERVED_CLOSID 0
@@ -37,28 +38,8 @@ enum resctrl_conf_type {
CDP_DATA,
};
-enum resctrl_res_level {
- RDT_RESOURCE_L3,
- RDT_RESOURCE_L2,
- RDT_RESOURCE_MBA,
- RDT_RESOURCE_SMBA,
-
- /* Must be the last */
- RDT_NUM_RESOURCES,
-};
-
#define CDP_NUM_TYPES (CDP_DATA + 1)
-/*
- * Event IDs, the values match those used to program IA32_QM_EVTSEL before
- * reading IA32_QM_CTR on RDT systems.
- */
-enum resctrl_event_id {
- QOS_L3_OCCUP_EVENT_ID = 0x01,
- QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
- QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
-};
-
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @new_ctrl: new ctrl value to be loaded
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
new file mode 100644
index 000000000000..51c51a1aabfb
--- /dev/null
+++ b/include/linux/resctrl_types.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2024 Arm Ltd.
+ * Based on arch/x86/kernel/cpu/resctrl/internal.h
+ */
+
+#ifndef __LINUX_RESCTRL_TYPES_H
+#define __LINUX_RESCTRL_TYPES_H
+
+/* Reads to Local DRAM Memory */
+#define READS_TO_LOCAL_MEM BIT(0)
+
+/* Reads to Remote DRAM Memory */
+#define READS_TO_REMOTE_MEM BIT(1)
+
+/* Non-Temporal Writes to Local Memory */
+#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
+
+/* Non-Temporal Writes to Remote Memory */
+#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
+
+/* Reads to Local Memory the system identifies as "Slow Memory" */
+#define READS_TO_LOCAL_S_MEM BIT(4)
+
+/* Reads to Remote Memory the system identifies as "Slow Memory" */
+#define READS_TO_REMOTE_S_MEM BIT(5)
+
+/* Dirty Victims to All Types of Memory */
+#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
+
+/* Max event bits supported */
+#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
+
+enum resctrl_res_level {
+ RDT_RESOURCE_L3,
+ RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,
+
+ /* Must be the last */
+ RDT_NUM_RESOURCES,
+};
+
+/*
+ * Event IDs, the values match those used to program IA32_QM_EVTSEL before
+ * reading IA32_QM_CTR on RDT systems.
+ */
+enum resctrl_event_id {
+ QOS_L3_OCCUP_EVENT_ID = 0x01,
+ QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
+ QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
+};
+
+#endif /* __LINUX_RESCTRL_TYPES_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header
2025-02-07 18:17 ` [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header James Morse
@ 2025-02-19 23:29 ` Reinette Chatre
2025-02-28 19:51 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:29 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> When resctrl is fully factored into core and per-arch code, each arch
> will need to use some resctrl common definitions in order to define its
> own specializations and helpers. Following conventional practice, it
> would be desirable to put the dependent arch definitions in an
> <asm/resctrl.h> header that is included by the common <linux/resctrl.h>
> header. However, this can make it awkward to avoid a circular
> dependency between <linux/resctrl.h> and the arch header.
>
> To avoid such dependencies, move the affected common types and
> constants into a new header that does not need to depend on
> <linux/resctrl.h> or on the arch headers.
>
> The same logic applies to the monitor-configuration defines, move these
> too.
>
> Some kind of enumeration for events is needed between the filesystem
> and architecture code. Take the x86 definition as its convenient for
> x86.
>
> The definition of enum resctrl_event_id is needed to allow the
> architecture code to define resctrl_arch_mon_ctx_alloc() and
> resctrl_arch_mon_ctx_free().
>
> The definition of enum resctrl_res_level is needed to allow the
> architecture code to define resctrl_arch_set_cdp_enabled() and
> resctrl_arch_get_cdp_enabled().
>
> The bits for mbm_local_bytes_config et al are ABI, and must be the same
> on all architectures. These are documented in
> Documentation/arch/x86/resctrl.rst
>
> The maintainers entry for these headers was missed when resctrl.h was
> created. Add a wildcard entry to match both resctrl.h and
> resctrl_types.h.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
...
> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
> new file mode 100644
> index 000000000000..51c51a1aabfb
> --- /dev/null
> +++ b/include/linux/resctrl_types.h
> @@ -0,0 +1,54 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2024 Arm Ltd.
Please note year.
> + * Based on arch/x86/kernel/cpu/resctrl/internal.h
> + */
> +
> +#ifndef __LINUX_RESCTRL_TYPES_H
> +#define __LINUX_RESCTRL_TYPES_H
> +
> +/* Reads to Local DRAM Memory */
> +#define READS_TO_LOCAL_MEM BIT(0)
> +
> +/* Reads to Remote DRAM Memory */
> +#define READS_TO_REMOTE_MEM BIT(1)
> +
> +/* Non-Temporal Writes to Local Memory */
> +#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
> +
> +/* Non-Temporal Writes to Remote Memory */
> +#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
> +
> +/* Reads to Local Memory the system identifies as "Slow Memory" */
> +#define READS_TO_LOCAL_S_MEM BIT(4)
> +
> +/* Reads to Remote Memory the system identifies as "Slow Memory" */
> +#define READS_TO_REMOTE_S_MEM BIT(5)
> +
> +/* Dirty Victims to All Types of Memory */
> +#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
> +
> +/* Max event bits supported */
> +#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
> +
> +enum resctrl_res_level {
> + RDT_RESOURCE_L3,
> + RDT_RESOURCE_L2,
> + RDT_RESOURCE_MBA,
> + RDT_RESOURCE_SMBA,
> +
> + /* Must be the last */
> + RDT_NUM_RESOURCES,
> +};
> +
> +/*
> + * Event IDs, the values match those used to program IA32_QM_EVTSEL before
> + * reading IA32_QM_CTR on RDT systems.
> + */
> +enum resctrl_event_id {
> + QOS_L3_OCCUP_EVENT_ID = 0x01,
> + QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
> + QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
> +};
> +
> +#endif /* __LINUX_RESCTRL_TYPES_H */
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
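
The monitor-configuration bits in the header quoted above are described as
ABI, so their numeric values matter. A minimal userspace sketch of how they
compose is shown here. BIT() and GENMASK() are re-defined locally as 32-bit
stand-ins for the kernel macros, and evt_config_valid() is a hypothetical
helper used only for illustration; it is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local 32-bit stand-ins for the kernel's BIT() and GENMASK(). */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))

/* Values copied from the quoted include/linux/resctrl_types.h. */
#define READS_TO_LOCAL_MEM		BIT(0)
#define READS_TO_REMOTE_MEM		BIT(1)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	BIT(2)
#define NON_TEMP_WRITE_TO_REMOTE_MEM	BIT(3)
#define READS_TO_LOCAL_S_MEM		BIT(4)
#define READS_TO_REMOTE_S_MEM		BIT(5)
#define DIRTY_VICTIMS_TO_ALL_MEM	BIT(6)
#define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)

/* Hypothetical helper: true if no bit outside the supported mask is set. */
static bool evt_config_valid(uint32_t val)
{
	return (val & ~MAX_EVT_CONFIG_BITS) == 0;
}
```

Because the mask covers bits 6..0, any combination of the seven defines
passes this check, while a value with bit 7 or above set is rejected.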
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header
2025-02-19 23:29 ` Reinette Chatre
@ 2025-02-28 19:51 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:51 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 19/02/2025 23:29, Reinette Chatre wrote:
> On 2/7/25 10:17 AM, James Morse wrote:
>> When resctrl is fully factored into core and per-arch code, each arch
>> will need to use some resctrl common definitions in order to define its
>> own specializations and helpers. Following conventional practice, it
>> would be desirable to put the dependent arch definitions in an
>> <asm/resctrl.h> header that is included by the common <linux/resctrl.h>
>> header. However, this can make it awkward to avoid a circular
>> dependency between <linux/resctrl.h> and the arch header.
>>
>> To avoid such dependencies, move the affected common types and
>> constants into a new header that does not need to depend on
>> <linux/resctrl.h> or on the arch headers.
>>
>> The same logic applies to the monitor-configuration defines, move these
>> too.
>>
>> Some kind of enumeration for events is needed between the filesystem
>> and architecture code. Take the x86 definition as it's convenient for
>> x86.
>>
>> The definition of enum resctrl_event_id is needed to allow the
>> architecture code to define resctrl_arch_mon_ctx_alloc() and
>> resctrl_arch_mon_ctx_free().
>>
>> The definition of enum resctrl_res_level is needed to allow the
>> architecture code to define resctrl_arch_set_cdp_enabled() and
>> resctrl_arch_get_cdp_enabled().
>>
>> The bits for mbm_local_bytes_config et al are ABI, and must be the same
>> on all architectures. These are documented in
>> Documentation/arch/x86/resctrl.rst
>>
>> The maintainers entry for these headers was missed when resctrl.h was
>> created. Add a wildcard entry to match both resctrl.h and
>> resctrl_types.h.
>> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
>> new file mode 100644
>> index 000000000000..51c51a1aabfb
>> --- /dev/null
>> +++ b/include/linux/resctrl_types.h
>> @@ -0,0 +1,54 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright (C) 2024 Arm Ltd.
>
> Please note year.
I've changed it.
[...]
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Thanks!
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 14/42] x86/resctrl: Add an arch helper to reset one resource
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (12 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 13/42] x86/resctrl: Move resctrl types to a separate header James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:32 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 15/42] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
` (29 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Tony Luck
On umount(), resctrl resets each resource back to its default
configuration. It only ever does this for all resources in one go.
reset_all_ctrls() is architecture specific as it works with struct
rdt_hw_resource.
Make reset_all_ctrls() an arch helper that resets one resource.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Move the arch/fs split into the for-each loop at Reinette's suggestion.
* Dropped a bunch of tags and rewrote the commit message.
Changes since v1:
* Rename the for_each_capable_rdt_resource() introduced in the new
function resctrl_arch_reset_resources(), back to
for_each_alloc_capable_rdt_resource() as it was in the original code.
The change looked unintentional; and presumably a resource that does
not support resource allocation doesn't have any properties to
reset...
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++++----
include/linux/resctrl.h | 9 +++++++++
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b2dad689e780..9eb57ebb36c6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2867,7 +2867,7 @@ static int rdt_init_fs_context(struct fs_context *fc)
return 0;
}
-static int reset_all_ctrls(struct rdt_resource *r)
+void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
struct rdt_hw_ctrl_domain *hw_dom;
@@ -2896,7 +2896,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
smp_call_function_any(&d->hdr.cpu_mask, rdt_ctrl_update, &msr_param, 1);
}
- return 0;
+ return;
}
/*
@@ -3015,9 +3015,10 @@ static void rdt_kill_sb(struct super_block *sb)
rdt_disable_ctx();
- /*Put everything back to default values. */
+ /* Put everything back to default values. */
for_each_alloc_capable_rdt_resource(r)
- reset_all_ctrls(r);
+ resctrl_arch_reset_all_ctrls(r);
+
rmdir_all_sub();
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 74cfd48e69ee..4444bc05e39c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -414,6 +414,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
*/
void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+/**
+ * resctrl_arch_reset_all_ctrls() - Reset the control for each CLOSID to its
+ * default.
+ * @r: The resctrl resource to reset.
+ *
+ * This can be called from any CPU.
+ */
+void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
+
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
--
2.39.2
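
The split described in this patch, where the filesystem code owns the loop
and the arch code resets one resource at a time, can be sketched in userspace
as follows. This is a simplified model, not the kernel code: struct
rdt_resource is cut down to the one real field the loop needs, the ctrl_val
and default_ctrl fields are hypothetical, and reset_on_umount() is an
illustrative name for the loop that lives in rdt_kill_sb().

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the filesystem-visible resource. */
struct rdt_resource {
	bool alloc_capable;
	unsigned int ctrl_val;		/* hypothetical: current control value */
	unsigned int default_ctrl;	/* hypothetical: value restored on reset */
};

static struct rdt_resource resources[] = {
	{ .alloc_capable = true,  .ctrl_val = 0x0f, .default_ctrl = 0xff },
	{ .alloc_capable = false, .ctrl_val = 0x03, .default_ctrl = 0xff },
	{ .alloc_capable = true,  .ctrl_val = 50,   .default_ctrl = 100 },
};

/* Arch side: reset exactly one resource (the real helper updates each domain). */
static void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
{
	r->ctrl_val = r->default_ctrl;
}

/* Filesystem side: walk the resources and skip those without controls. */
static size_t reset_on_umount(void)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(resources) / sizeof(resources[0]); i++) {
		if (!resources[i].alloc_capable)
			continue;
		resctrl_arch_reset_all_ctrls(&resources[i]);
		n++;
	}

	return n;
}
```

The middle entry is left untouched because it is not alloc_capable, which
mirrors the use of for_each_alloc_capable_rdt_resource() in the diff.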
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 14/42] x86/resctrl: Add an arch helper to reset one resource
2025-02-07 18:17 ` [PATCH v6 14/42] x86/resctrl: Add an arch helper to reset one resource James Morse
@ 2025-02-19 23:32 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:32 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> On umount(), resctrl resets each resource back to its default
> configuration. It only ever does this for all resources in one go.
>
> reset_all_ctrls() is architecture specific as it works with struct
> rdt_hw_resource.
>
> Make reset_all_ctrls() an arch helper that resets one resource.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 15/42] x86/resctrl: Move monitor exit work to a resctrl exit call
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (13 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 14/42] x86/resctrl: Add an arch helper to reset one resource James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:38 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 16/42] x86/resctrl: Move monitor init work to a resctrl init call James Morse
` (28 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
rdt_put_mon_l3_config() is called via the architecture's
resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
and closid_num_dirty_rmid[] arrays. In reality this code is marked
__exit, and is removed by the linker as resctrl can't be built
as a module.
To separate the filesystem and architecture parts of resctrl,
this free()ing work needs to be triggered by the filesystem,
as these structures belong to the filesystem code.
Rename rdt_put_mon_l3_config() to resctrl_mon_resource_exit()
and call it from resctrl_exit(). The kfree() is currently
dependent on r->mon_capable.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v4:
* Added __exit so it can be removed in the next patch.
Changes since v3:
* Moved r->mon_capable check under the lock.
* Dropped references to resctrl_mon_resource_init() from the commit message.
* Fixed more resctrl typos.
Changes since v2:
* Dropped __exit as needed in the next patch.
Change since v1:
* [Commit message only] Typo fixes:
s/restrl/resctrl/g
s/resctl/resctrl/g
* [Commit message only] Reword second paragraph to remove reference to
the MPAM error interrupt, which provides background rationale for a
later patch rather than for this patch, and so it is not really
relevant here.
---
arch/x86/kernel/cpu/resctrl/core.c | 5 -----
arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 12 +++++++++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 ++
4 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 49a9ac0dd96c..b9c4a4e40e35 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -1075,14 +1075,9 @@ late_initcall(resctrl_arch_late_init);
static void __exit resctrl_arch_exit(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
-
cpuhp_remove_state(rdt_online);
resctrl_exit();
-
- if (r->mon_capable)
- rdt_put_mon_l3_config();
}
__exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5f3713fb2eaf..73005ca2dda1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -586,7 +586,7 @@ void closid_free(int closid);
int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
-void __exit rdt_put_mon_l3_config(void);
+void __exit resctrl_mon_resource_exit(void);
bool __init rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index e8388d19a579..15e8c0190bfc 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1040,10 +1040,13 @@ static int dom_data_init(struct rdt_resource *r)
return err;
}
-static void __exit dom_data_exit(void)
+static void __exit dom_data_exit(struct rdt_resource *r)
{
mutex_lock(&rdtgroup_mutex);
+ if (!r->mon_capable)
+ goto out_unlock;
+
if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
kfree(closid_num_dirty_rmid);
closid_num_dirty_rmid = NULL;
@@ -1052,6 +1055,7 @@ static void __exit dom_data_exit(void)
kfree(rmid_ptrs);
rmid_ptrs = NULL;
+out_unlock:
mutex_unlock(&rdtgroup_mutex);
}
@@ -1237,9 +1241,11 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
return 0;
}
-void __exit rdt_put_mon_l3_config(void)
+void __exit resctrl_mon_resource_exit(void)
{
- dom_data_exit();
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+ dom_data_exit(r);
}
void __init intel_rdt_mbm_apply_quirk(void)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9eb57ebb36c6..42c48e79364d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4296,4 +4296,6 @@ void __exit resctrl_exit(void)
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
sysfs_remove_mount_point(fs_kobj, "resctrl");
+
+ resctrl_mon_resource_exit();
}
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 15/42] x86/resctrl: Move monitor exit work to a resctrl exit call
2025-02-07 18:17 ` [PATCH v6 15/42] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
@ 2025-02-19 23:38 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:38 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> rdt_put_mon_l3_config() is called via the architecture's
> resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
> and closid_num_dirty_rmid[] arrays. In reality this code is marked
> __exit, and is removed by the linker as resctrl can't be built
> as a module.
>
> To separate the filesystem and architecture parts of resctrl,
> this free()ing work needs to be triggered by the filesystem,
> as these structures belong to the filesystem code.
>
> Rename rdt_put_mon_l3_config() to resctrl_mon_resource_exit()
> and call it from resctrl_exit(). The kfree() is currently
> dependent on r->mon_capable.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 16/42] x86/resctrl: Move monitor init work to a resctrl init call
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (14 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 15/42] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:43 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 17/42] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
` (27 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
rdt_get_mon_l3_config() is called from the architecture's
resctrl_arch_late_init(), and initialises both architecture specific
fields, such as hw_res->mon_scale, and resctrl filesystem fields
by calling dom_data_init().
To separate the filesystem and architecture parts of resctrl, this
function needs splitting up.
Add resctrl_mon_resource_init() to do the filesystem specific work,
and call it from resctrl_init(). This runs later, but is still before
the filesystem is mounted and the rmid_ptrs[] array can be used.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v4:
* Removed __exit markers
Changes since v3:
* Added a comment over resctrl_mon_resource_init().
* Added a comment over domain_setup_mon_state() to warn of cpuhp ordering.
* Added __init to resctrl_mon_resource_init().
Changes since v2:
* Added error handling for the case sysfs files can't be created.
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 40 ++++++++++++++++++++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 22 +++++++++++++-
3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 73005ca2dda1..70fbb902e85e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -586,13 +586,14 @@ void closid_free(int closid);
int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
-void __exit resctrl_mon_resource_exit(void);
+void resctrl_mon_resource_exit(void);
bool __init rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
cpumask_t *cpumask, int evtid, int first);
+int __init resctrl_mon_resource_init(void);
void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
unsigned long delay_ms,
int exclude_cpu);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 15e8c0190bfc..1730ba814834 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1040,7 +1040,7 @@ static int dom_data_init(struct rdt_resource *r)
return err;
}
-static void __exit dom_data_exit(struct rdt_resource *r)
+static void dom_data_exit(struct rdt_resource *r)
{
mutex_lock(&rdtgroup_mutex);
@@ -1176,12 +1176,40 @@ static __init int snc_get_config(void)
return ret;
}
+/**
+ * resctrl_mon_resource_init() - Initialise global monitoring structures.
+ *
+ * Allocate and initialise global monitor resources that do not belong to a
+ * specific domain. i.e. the rmid_ptrs[] used for the limbo and free lists.
+ * Called once during boot after the struct rdt_resource's have been configured
+ * but before the filesystem is mounted.
+ * Resctrl's cpuhp callbacks may be called before this point to bring a domain
+ * online.
+ *
+ * Returns 0 for success, or -ENOMEM.
+ */
+int __init resctrl_mon_resource_init(void)
+{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+ int ret;
+
+ if (!r->mon_capable)
+ return 0;
+
+ ret = dom_data_init(r);
+ if (ret)
+ return ret;
+
+ l3_mon_evt_init(r);
+
+ return 0;
+}
+
int __init rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
unsigned int threshold;
- int ret;
snc_nodes_per_l3_cache = snc_get_config();
@@ -1211,10 +1239,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
*/
resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
- ret = dom_data_init(r);
- if (ret)
- return ret;
-
if (rdt_cpu_has(X86_FEATURE_BMEC)) {
u32 eax, ebx, ecx, edx;
@@ -1234,14 +1258,12 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
}
}
- l3_mon_evt_init(r);
-
r->mon_capable = true;
return 0;
}
-void __exit resctrl_mon_resource_exit(void)
+void resctrl_mon_resource_exit(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 42c48e79364d..badac3f5da72 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4102,6 +4102,19 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
mutex_unlock(&rdtgroup_mutex);
}
+/**
+ * domain_setup_mon_state() - Initialise domain monitoring structures.
+ * @r: The resource for the newly online domain.
+ * @d: The newly online domain.
+ *
+ * Allocate monitor resources that belong to this domain.
+ * Called when the first CPU of a domain comes online, regardless of whether
+ * the filesystem is mounted.
+ * During boot this may be called before global allocations have been made by
+ * resctrl_mon_resource_init().
+ *
+ * Returns 0 for success, or -ENOMEM.
+ */
static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain *d)
{
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
@@ -4252,9 +4265,15 @@ int __init resctrl_init(void)
rdtgroup_setup_default();
+ ret = resctrl_mon_resource_init();
+ if (ret)
+ return ret;
+
ret = sysfs_create_mount_point(fs_kobj, "resctrl");
- if (ret)
+ if (ret) {
+ resctrl_mon_resource_exit();
return ret;
+ }
ret = register_filesystem(&rdt_fs_type);
if (ret)
@@ -4287,6 +4306,7 @@ int __init resctrl_init(void)
cleanup_mountpoint:
sysfs_remove_mount_point(fs_kobj, "resctrl");
+ resctrl_mon_resource_exit();
return ret;
}
--
2.39.2
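
The error handling added to resctrl_init() follows the usual pattern of
undoing the global allocation when a later step fails. A minimal userspace
sketch of that ordering is below. It is a model under stated assumptions:
malloc/free stand in for the kernel allocator, the array size of 16 is
arbitrary, and create_mount_point() with its fail_mount_point knob is a
hypothetical stand-in for sysfs_create_mount_point().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int *rmid_ptrs;		/* stand-in for the global rmid_ptrs[] array */
static bool fail_mount_point;	/* hypothetical knob to force the error path */

/* Allocate the global monitor state; returns 0 or -ENOMEM. */
static int resctrl_mon_resource_init(void)
{
	rmid_ptrs = calloc(16, sizeof(*rmid_ptrs));
	return rmid_ptrs ? 0 : -ENOMEM;
}

/* Free the global monitor state; safe to call more than once. */
static void resctrl_mon_resource_exit(void)
{
	free(rmid_ptrs);
	rmid_ptrs = NULL;
}

/* Hypothetical stand-in for sysfs_create_mount_point(). */
static int create_mount_point(void)
{
	return fail_mount_point ? -EEXIST : 0;
}

/* Init order as in the diff: allocate first, unwind if a later step fails. */
static int resctrl_init_sketch(void)
{
	int ret;

	ret = resctrl_mon_resource_init();
	if (ret)
		return ret;

	ret = create_mount_point();
	if (ret) {
		resctrl_mon_resource_exit();
		return ret;
	}

	return 0;
}
```

On the failure path the array is freed before the error is returned, so the
caller never sees a half-initialised state.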
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 16/42] x86/resctrl: Move monitor init work to a resctrl init call
2025-02-07 18:17 ` [PATCH v6 16/42] x86/resctrl: Move monitor init work to a resctrl init call James Morse
@ 2025-02-19 23:43 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:43 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> rdt_get_mon_l3_config() is called from the architecture's
> resctrl_arch_late_init(), and initialises both architecture specific
> fields, such as hw_res->mon_scale, and resctrl filesystem fields
> by calling dom_data_init().
>
> To separate the filesystem and architecture parts of resctrl, this
> function needs splitting up.
>
> Add resctrl_mon_resource_init() to do the filesystem specific work,
> and call it from resctrl_init(). This runs later, but is still before
> the filesystem is mounted and the rmid_ptrs[] array can be used.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 17/42] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (15 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 16/42] x86/resctrl: Move monitor init work to a resctrl init call James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:47 ` Reinette Chatre
2025-02-07 18:17 ` [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
` (26 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The for_each_*_rdt_resource() helpers walk the architecture's array
of structures, using the resctrl visible part as an iterator. These
became over-complex when the structures were split into a
filesystem and architecture-specific struct. This approach avoided
the need to touch every call site, and was done before there was a
helper to retrieve a resource by rid.
Once the filesystem parts of resctrl are moved to /fs/, both the
architecture's resource array and the definition of those structures
are no longer accessible. To support resctrl, each architecture would
have to provide equally complex macros.
Rewrite the macro to make use of resctrl_arch_get_resource(), and
move these to include/linux/resctrl.h so existing x86 arch code continues
to use them.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Fixed an off by one in for_each_rdt_resource().
* Added header file path to commit message.
Changes since v3:
* Restructure the existing macros instead of open-coding the for loop.
Changes since v1:
* [Whitespace only] Fix bogus whitespace introduced in
rdtgroup_create_info_dir().
* [Commit message only] Typo fix:
s/architectures/architecture's/g
---
arch/x86/kernel/cpu/resctrl/internal.h | 29 --------------------------
include/linux/resctrl.h | 18 ++++++++++++++++
2 files changed, 18 insertions(+), 29 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 70fbb902e85e..82dbc1606663 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -475,14 +475,6 @@ extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
extern enum resctrl_event_id mba_mbps_default_event;
-static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
-{
- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
-
- hw_res++;
- return &hw_res->r_resctrl;
-}
-
static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
{
return rdt_resources_all[l].cdp_enabled;
@@ -492,27 +484,6 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
-/*
- * To return the common struct rdt_resource, which is contained in struct
- * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
- */
-#define for_each_rdt_resource(r) \
- for (r = &rdt_resources_all[0].r_resctrl; \
- r <= &rdt_resources_all[RDT_NUM_RESOURCES - 1].r_resctrl; \
- r = resctrl_inc(r))
-
-#define for_each_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->alloc_capable || r->mon_capable)
-
-#define for_each_alloc_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->alloc_capable)
-
-#define for_each_mon_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->mon_capable)
-
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
union cpuid_0x10_1_eax {
struct {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4444bc05e39c..686d33d67456 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -26,6 +26,24 @@ int proc_resctrl_show(struct seq_file *m,
/* max value for struct rdt_domain's mbps_val */
#define MBA_MAX_MBPS U32_MAX
+/* Walk all possible resources, with variants for only controls or monitors. */
+#define for_each_rdt_resource(_r) \
+ for ((_r) = resctrl_arch_get_resource(0); \
+ (_r) && (_r)->rid < RDT_NUM_RESOURCES; \
+ (_r) = resctrl_arch_get_resource((_r)->rid + 1))
+
+#define for_each_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->alloc_capable || (r)->mon_capable)
+
+#define for_each_alloc_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->alloc_capable)
+
+#define for_each_mon_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->mon_capable)
+
/**
* enum resctrl_conf_type - The type of configuration.
* @CDP_NONE: No prioritisation, both code and data are controlled or monitored.
--
2.39.2
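
The rewritten walkers only need an accessor that maps a resource id to a
struct rdt_resource. The userspace sketch below copies the macros from the
diff and runs them against a mock. The mock_resources[] table and the cut-down
struct are assumptions made for illustration, as is the accessor returning
NULL for an out-of-range id, which is the behaviour the loop condition relies
on here. The count_*() helpers are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,

	/* Must be the last */
	RDT_NUM_RESOURCES,
};

/* Cut-down stand-in for the filesystem-visible resource. */
struct rdt_resource {
	int rid;
	bool alloc_capable;
	bool mon_capable;
};

static struct rdt_resource mock_resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]   = { .rid = RDT_RESOURCE_L3,   .alloc_capable = true, .mon_capable = true },
	[RDT_RESOURCE_L2]   = { .rid = RDT_RESOURCE_L2,   .alloc_capable = true },
	[RDT_RESOURCE_MBA]  = { .rid = RDT_RESOURCE_MBA,  .alloc_capable = true },
	[RDT_RESOURCE_SMBA] = { .rid = RDT_RESOURCE_SMBA },
};

/* Mock accessor; assumed to return NULL for an out-of-range id. */
static struct rdt_resource *resctrl_arch_get_resource(int l)
{
	if (l < 0 || l >= RDT_NUM_RESOURCES)
		return NULL;

	return &mock_resources[l];
}

/* Macros as in the diff above. */
#define for_each_rdt_resource(_r)					\
	for ((_r) = resctrl_arch_get_resource(0);			\
	     (_r) && (_r)->rid < RDT_NUM_RESOURCES;			\
	     (_r) = resctrl_arch_get_resource((_r)->rid + 1))

#define for_each_alloc_capable_rdt_resource(r)				\
	for_each_rdt_resource((r))					\
		if ((r)->alloc_capable)

#define for_each_mon_capable_rdt_resource(r)				\
	for_each_rdt_resource((r))					\
		if ((r)->mon_capable)

static int count_all(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_rdt_resource(r)
		n++;

	return n;
}

static int count_alloc_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_rdt_resource(r)
		n++;

	return n;
}

static int count_mon_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_mon_capable_rdt_resource(r)
		n++;

	return n;
}
```

With this mock table the plain walker visits all four resources and stops
when the accessor returns NULL for id 4, while the filtered variants visit
only the entries whose capability flag is set.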
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 17/42] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2025-02-07 18:17 ` [PATCH v6 17/42] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
@ 2025-02-19 23:47 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:47 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> The for_each_*_rdt_resource() helpers walk the architecture's array
> of structures, using the resctrl visible part as an iterator. These
> became over-complex when the structures were split into a
> filesystem and architecture-specific struct. This approach avoided
> the need to touch every call site, and was done before there was a
> helper to retrieve a resource by rid.
>
> Once the filesystem parts of resctrl are moved to /fs/, both the
> architecture's resource array, and the definition of those structures
> is no longer accessible. To support resctrl, each architecture would
> have to provide equally complex macros.
>
> Rewrite the macro to make use of resctrl_arch_get_resource(), and
> move these to include/linux/resctrl.h so existing x86 arch code continues
> to use them.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
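A minimal userspace sketch of the rewritten walkers follows. The macros are copied from the patch; everything else (the trimmed struct rdt_resource, the mock_resources array, the capability values, and the count_*() helpers) is a hypothetical stand-in, and the mock resctrl_arch_get_resource() is assumed to return NULL for an out-of-range rid. It is meant only to show that the walkers need nothing beyond the accessor and the rid field, not to reproduce the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel definitions; only the fields the walkers use. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,
	RDT_NUM_RESOURCES,
};

struct rdt_resource {
	int rid;
	bool alloc_capable;
	bool mon_capable;
};

/* Mock architecture-private array; the capability values are made up. */
static struct rdt_resource mock_resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]   = { .rid = RDT_RESOURCE_L3,   .alloc_capable = true,  .mon_capable = true  },
	[RDT_RESOURCE_L2]   = { .rid = RDT_RESOURCE_L2,   .alloc_capable = true,  .mon_capable = false },
	[RDT_RESOURCE_MBA]  = { .rid = RDT_RESOURCE_MBA,  .alloc_capable = false, .mon_capable = false },
	[RDT_RESOURCE_SMBA] = { .rid = RDT_RESOURCE_SMBA, .alloc_capable = false, .mon_capable = false },
};

/* Mock accessor: assumed to return NULL once rid runs off the end. */
static struct rdt_resource *resctrl_arch_get_resource(int rid)
{
	if (rid < 0 || rid >= RDT_NUM_RESOURCES)
		return NULL;
	return &mock_resources[rid];
}

/* The walkers as added to include/linux/resctrl.h by the patch. */
#define for_each_rdt_resource(_r)					\
	for ((_r) = resctrl_arch_get_resource(0);			\
	     (_r) && (_r)->rid < RDT_NUM_RESOURCES;			\
	     (_r) = resctrl_arch_get_resource((_r)->rid + 1))

#define for_each_alloc_capable_rdt_resource(r)				\
	for_each_rdt_resource((r))					\
		if ((r)->alloc_capable)

#define for_each_mon_capable_rdt_resource(r)				\
	for_each_rdt_resource((r))					\
		if ((r)->mon_capable)

/* Hypothetical callers: count the resources each walker visits. */
int count_all(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_rdt_resource(r)
		n++;
	return n;
}

int count_alloc_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_rdt_resource(r)
		n++;
	return n;
}

int count_mon_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_mon_capable_rdt_resource(r)
		n++;
	return n;
}
```

The callers never name the backing array, which is the property the move to /fs/ relies on: another architecture only has to supply its own resctrl_arch_get_resource().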
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (16 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 17/42] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
@ 2025-02-07 18:17 ` James Morse
2025-02-19 23:55 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
` (25 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:17 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The architecture specific parts of resctrl provide helpers like
is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses
to the rdt_mon_features bitmap.
Exposing a group of helpers between the architecture and filesystem code
is preferable to a single unsigned int like rdt_mon_features. Helpers
can be more readable and have a well defined behaviour, while allowing
architectures to hide more complex behaviour.
Once the filesystem parts of resctrl are moved, these existing helpers can
no longer live in internal.h. Move them to asm/resctrl.h.
Once these are exposed to the wider kernel, they should have a
'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
Move and rename the helpers that touch rdt_mon_features directly.
is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
so can be moved into that file.
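The split described above can be sketched in userspace as follows. This is an illustration under assumptions, not the patch itself: the event id values are assumed to match the kernel's enum resctrl_event_id, and rdt_mon_features is modelled as a plain global. The three resctrl_arch_*() helpers are the only code that touches the bitmap; the two resctrl_*() helpers stand for the filesystem side and are built purely on top of them.

```c
#include <stdbool.h>

/* Event ids; the numeric values are assumed to match the kernel's. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID     = 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};

/* Architecture-private state: one bit per enabled monitoring event. */
unsigned int rdt_mon_features;

/* Architecture side: the only helpers that read the bitmap directly. */
static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
{
	return rdt_mon_features & (1U << QOS_L3_OCCUP_EVENT_ID);
}

static inline bool resctrl_arch_is_mbm_total_enabled(void)
{
	return rdt_mon_features & (1U << QOS_L3_MBM_TOTAL_EVENT_ID);
}

static inline bool resctrl_arch_is_mbm_local_enabled(void)
{
	return rdt_mon_features & (1U << QOS_L3_MBM_LOCAL_EVENT_ID);
}

/* Filesystem side: composed from the arch helpers, no bitmap access. */
bool resctrl_is_mbm_enabled(void)
{
	return resctrl_arch_is_mbm_total_enabled() ||
	       resctrl_arch_is_mbm_local_enabled();
}

/* True for either MBM event id, regardless of what is enabled. */
bool resctrl_is_mbm_event(int e)
{
	return e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
	       e <= QOS_L3_MBM_LOCAL_EVENT_ID;
}
```

An architecture without a feature bitmap could implement the three resctrl_arch_*() helpers some other way and the filesystem-side helpers would not need to change.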
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Removed the word 'export' due to its kernel-specific meaning.
* Reworded commit message.
---
arch/x86/include/asm/resctrl.h | 16 +++++++++
arch/x86/kernel/cpu/resctrl/core.c | 8 ++---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 +--
arch/x86/kernel/cpu/resctrl/internal.h | 27 ---------------
arch/x86/kernel/cpu/resctrl/monitor.c | 19 ++++++-----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++--------
6 files changed, 59 insertions(+), 55 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 52f2326e2b1e..6d4c7ea2c9e3 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -42,6 +42,7 @@ DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
extern bool rdt_alloc_capable;
extern bool rdt_mon_capable;
+extern unsigned int rdt_mon_features;
DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
@@ -81,6 +82,21 @@ static inline void resctrl_arch_disable_mon(void)
static_branch_dec_cpuslocked(&rdt_enable_key);
}
+static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_total_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_local_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
+}
+
/*
* __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
*
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index b9c4a4e40e35..7d14d80b3f94 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -454,13 +454,13 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
{
size_t tsize;
- if (is_mbm_total_enabled()) {
+ if (resctrl_arch_is_mbm_total_enabled()) {
tsize = sizeof(*hw_dom->arch_mbm_total);
hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
if (!hw_dom->arch_mbm_total)
return -ENOMEM;
}
- if (is_mbm_local_enabled()) {
+ if (resctrl_arch_is_mbm_local_enabled()) {
tsize = sizeof(*hw_dom->arch_mbm_local);
hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
if (!hw_dom->arch_mbm_local) {
@@ -909,9 +909,9 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
- else if (is_mbm_total_enabled())
+ else if (resctrl_arch_is_mbm_total_enabled())
mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
return !rdt_get_mon_l3_config(r);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 7df98fda8a32..a93b40ea0bad 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -559,12 +559,12 @@ ssize_t rdtgroup_mba_mbps_event_write(struct kernfs_open_file *of,
rdt_last_cmd_clear();
if (!strcmp(buf, "mbm_local_bytes")) {
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
rdtgrp->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
else
ret = -EINVAL;
} else if (!strcmp(buf, "mbm_total_bytes")) {
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
rdtgrp->mba_mbps_event = QOS_L3_MBM_TOTAL_EVENT_ID;
else
ret = -EINVAL;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 82dbc1606663..4a5996d1e060 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -156,7 +156,6 @@ struct rmid_read {
void *arch_mon_ctx;
};
-extern unsigned int rdt_mon_features;
extern struct list_head resctrl_schema_all;
extern bool resctrl_mounted;
@@ -406,32 +405,6 @@ struct msr_param {
u32 high;
};
-static inline bool is_llc_occupancy_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
-}
-
-static inline bool is_mbm_total_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
-}
-
-static inline bool is_mbm_local_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
-}
-
-static inline bool is_mbm_enabled(void)
-{
- return (is_mbm_total_enabled() || is_mbm_local_enabled());
-}
-
-static inline bool is_mbm_event(int e)
-{
- return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
- e <= QOS_L3_MBM_LOCAL_EVENT_ID);
-}
-
/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
* @r_resctrl: Attributes of the resource used directly by resctrl.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1730ba814834..b7d93670ed94 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -295,11 +295,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
{
struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
memset(hw_dom->arch_mbm_total, 0,
sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
memset(hw_dom->arch_mbm_local, 0,
sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
}
@@ -569,7 +569,7 @@ void free_rmid(u32 closid, u32 rmid)
entry = __rmid_entry(idx);
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
add_rmid_to_limbo(entry);
else
list_add_tail(&entry->list, &rmid_free_lru);
@@ -761,6 +761,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
struct rdtgroup *entry;
u32 cur_bw, user_bw;
+ if (!resctrl_arch_is_mbm_local_enabled())
+ return;
+
r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
evt_id = rgrp->mba_mbps_event;
@@ -852,10 +855,10 @@ static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
* This is protected from concurrent reads from user as both
* the user and overflow handler hold the global mutex.
*/
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
}
@@ -1085,11 +1088,11 @@ static void l3_mon_evt_init(struct rdt_resource *r)
{
INIT_LIST_HEAD(&r->evt_list);
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
list_add_tail(&llc_occupancy_event.list, &r->evt_list);
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
list_add_tail(&mbm_total_event.list, &r->evt_list);
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
list_add_tail(&mbm_local_event.list, &r->evt_list);
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index badac3f5da72..eb32fbc3abea 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -117,6 +117,18 @@ void rdt_staged_configs_clear(void)
}
}
+static bool resctrl_is_mbm_enabled(void)
+{
+ return (resctrl_arch_is_mbm_total_enabled() ||
+ resctrl_arch_is_mbm_local_enabled());
+}
+
+static bool resctrl_is_mbm_event(int e)
+{
+ return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
+ e <= QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
/*
* Trivial allocator for CLOSIDs. Since h/w only supports a small number,
* we can keep a bitmap of free CLOSIDs in a single integer.
@@ -164,7 +176,7 @@ static int closid_alloc(void)
lockdep_assert_held(&rdtgroup_mutex);
if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
- is_llc_occupancy_enabled()) {
+ resctrl_arch_is_llc_occupancy_enabled()) {
cleanest_closid = resctrl_find_cleanest_closid();
if (cleanest_closid < 0)
return cleanest_closid;
@@ -2378,7 +2390,7 @@ static bool supports_mba_mbps(void)
struct rdt_resource *rmbm = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
- return (is_mbm_enabled() &&
+ return (resctrl_is_mbm_enabled() &&
r->alloc_capable && is_mba_linear() &&
r->ctrl_scope == rmbm->mon_scope);
}
@@ -2756,7 +2768,7 @@ static int rdt_get_tree(struct fs_context *fc)
if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable())
resctrl_mounted = true;
- if (is_mbm_enabled()) {
+ if (resctrl_is_mbm_enabled()) {
r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
list_for_each_entry(dom, &r->mon_domains, hdr.list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
@@ -3125,7 +3137,7 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
if (ret)
return ret;
- if (!do_sum && is_mbm_event(mevt->evtid))
+ if (!do_sum && resctrl_is_mbm_event(mevt->evtid))
mon_event_read(&rr, r, d, prgrp, &d->hdr.cpu_mask, mevt->evtid, true);
}
@@ -4082,9 +4094,9 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
if (resctrl_mounted && resctrl_arch_mon_capable())
rmdir_mondata_subdir_allrdtgrp(r, d);
- if (is_mbm_enabled())
+ if (resctrl_is_mbm_enabled())
cancel_delayed_work(&d->mbm_over);
- if (is_llc_occupancy_enabled() && has_busy_rmid(d)) {
+ if (resctrl_arch_is_llc_occupancy_enabled() && has_busy_rmid(d)) {
/*
* When a package is going down, forcefully
* decrement rmid->ebusy. There is no way to know
@@ -4120,12 +4132,12 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
size_t tsize;
- if (is_llc_occupancy_enabled()) {
+ if (resctrl_arch_is_llc_occupancy_enabled()) {
d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
if (!d->rmid_busy_llc)
return -ENOMEM;
}
- if (is_mbm_total_enabled()) {
+ if (resctrl_arch_is_mbm_total_enabled()) {
tsize = sizeof(*d->mbm_total);
d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
if (!d->mbm_total) {
@@ -4133,7 +4145,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
return -ENOMEM;
}
}
- if (is_mbm_local_enabled()) {
+ if (resctrl_arch_is_mbm_local_enabled()) {
tsize = sizeof(*d->mbm_local);
d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
if (!d->mbm_local) {
@@ -4172,13 +4184,13 @@ int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
if (err)
goto out_unlock;
- if (is_mbm_enabled()) {
+ if (resctrl_is_mbm_enabled()) {
INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL,
RESCTRL_PICK_ANY_CPU);
}
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
/*
@@ -4233,12 +4245,12 @@ void resctrl_offline_cpu(unsigned int cpu)
d = get_mon_domain_from_cpu(cpu, l3);
if (d) {
- if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
+ if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0, cpu);
}
- if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
- has_busy_rmid(d)) {
+ if (resctrl_arch_is_llc_occupancy_enabled() &&
+ cpu == d->cqm_work_cpu && has_busy_rmid(d)) {
cancel_delayed_work(&d->cqm_limbo);
cqm_setup_limbo_handler(d, 0, cpu);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h
2025-02-07 18:17 ` [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
@ 2025-02-19 23:55 ` Reinette Chatre
2025-02-28 19:55 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-19 23:55 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:17 AM, James Morse wrote:
> The architecture specific parts of resctrl provide helpers like
> is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses
> to the rdt_mon_features bitmap.
>
> Exposing a group of helpers between the architecture and filesystem code
> is preferable to a single unsigned-long like rdt_mon_features. Helpers
> can be more readable and have a well defined behaviour, while allowing
> architectures to hide more complex behaviour.
>
> Once the filesystem parts of resctrl are moved, these existing helpers can
> no longer live in internal.h. Move them to include/linux/resctrl.h
> Once these are exposed to the wider kernel, they should have a
> 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
>
> Move and rename the helpers that touch rdt_mon_features directly.
> is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
> so can be moved into that file.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
...
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 1730ba814834..b7d93670ed94 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
...
> @@ -761,6 +761,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> struct rdtgroup *entry;
> u32 cur_bw, user_bw;
>
> + if (!resctrl_arch_is_mbm_local_enabled())
> + return;
> +
> r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
> evt_id = rgrp->mba_mbps_event;
>
Please drop this hunk. A new resctrl feature [1] makes it possible for the software
controller to work with local as well as total bandwidth events.
Reinette
[1] https://lore.kernel.org/all/20241206163148.83828-1-tony.luck@intel.com/
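The conflict can be sketched in userspace under stated assumptions. struct mock_mon_state and both function names are hypothetical; the selection order (local first, then total) follows the mba_mbps_default_event assignment in get_rdt_mon_resources() shown in the patch, and the guard is the condition from the hunk under discussion. The event id values are assumed to match the kernel's.

```c
#include <stdbool.h>

/* Event ids; the numeric values are assumed to match the kernel's. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID     = 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};

/* Hypothetical snapshot of which MBM events the platform provides. */
struct mock_mon_state {
	bool mbm_total_enabled;
	bool mbm_local_enabled;
};

/*
 * Default event for the software controller: prefer local bandwidth,
 * fall back to total. Returns 0 when neither event is available.
 */
int pick_mba_mbps_default_event(const struct mock_mon_state *s)
{
	if (s->mbm_local_enabled)
		return QOS_L3_MBM_LOCAL_EVENT_ID;
	if (s->mbm_total_enabled)
		return QOS_L3_MBM_TOTAL_EVENT_ID;
	return 0;
}

/* The early-return condition from the hunk that is being dropped. */
bool dropped_hunk_would_skip(const struct mock_mon_state *s)
{
	return !s->mbm_local_enabled;
}
```

On a platform modelled with only the total event enabled, the first function still picks a usable event while the second reports that the bandwidth update would have been skipped.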
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h
2025-02-19 23:55 ` Reinette Chatre
@ 2025-02-28 19:55 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:55 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 19/02/2025 23:55, Reinette Chatre wrote:
> On 2/7/25 10:17 AM, James Morse wrote:
>> The architecture specific parts of resctrl provide helpers like
>> is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses
>> to the rdt_mon_features bitmap.
>>
>> Exposing a group of helpers between the architecture and filesystem code
>> is preferable to a single unsigned-long like rdt_mon_features. Helpers
>> can be more readable and have a well defined behaviour, while allowing
>> architectures to hide more complex behaviour.
>>
>> Once the filesystem parts of resctrl are moved, these existing helpers can
>> no longer live in internal.h. Move them to include/linux/resctrl.h
>> Once these are exposed to the wider kernel, they should have a
>> 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
>>
>> Move and rename the helpers that touch rdt_mon_features directly.
>> is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
>> so can be moved into that file.
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 1730ba814834..b7d93670ed94 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>
> ...
>
>> @@ -761,6 +761,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>> struct rdtgroup *entry;
>> u32 cur_bw, user_bw;
>>
>> + if (!resctrl_arch_is_mbm_local_enabled())
>> + return;
>> +
>> r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
>> evt_id = rgrp->mba_mbps_event;
>>
>
> Please drop this hunk. A new [1] resctrl feature makes it possible for software
> controller to work with local as well as total bandwidth events.
Thanks - that was evidently a rebase conflict I messed up!
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (17 preceding siblings ...)
2025-02-07 18:17 ` [PATCH v6 18/42] x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:13 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
` (24 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
When BMEC is supported the resctrl event can be configured in a number
of ways. This depends on architecture support. rdt_get_mon_l3_config()
modifies the struct mon_evt and calls mbm_config_rftype_init() to create
the files that allow the configuration.
Splitting this into separate architecture and filesystem parts would
require the struct mon_evt and mbm_config_rftype_init() to be exposed.
Instead, add resctrl_arch_is_evt_configurable(), and use this from
resctrl_mon_resource_init() to initialise struct mon_evt and call
mbm_config_rftype_init().
resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
obviously benefit from being inlined. Putting it in core.c will allow
rdt_cpu_has() to eventually become static.
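The shape of the new helper can be sketched in userspace as follows. The MOCK_FEATURE_* flags, mock_cpu_features and mock_cpu_has() are hypothetical stand-ins for the X86_FEATURE_* bits and rdt_cpu_has(); the event id values are assumed to match the kernel's. The point is that the filesystem code only asks a yes/no question per event, while the feature probing stays on the architecture side.

```c
#include <stdbool.h>

/* Event ids; the numeric values are assumed to match the kernel's. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID     = 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};

/* Hypothetical feature flags standing in for the X86_FEATURE_* bits. */
enum mock_feature {
	MOCK_FEATURE_BMEC          = 1U << 0,
	MOCK_FEATURE_CQM_MBM_TOTAL = 1U << 1,
	MOCK_FEATURE_CQM_MBM_LOCAL = 1U << 2,
};

/* Mock CPU feature word, set by the caller. */
unsigned int mock_cpu_features;

/* Stand-in for rdt_cpu_has(): true when the flag is present. */
static bool mock_cpu_has(unsigned int flag)
{
	return (mock_cpu_features & flag) != 0;
}

/*
 * Same structure as the helper in the patch: without BMEC nothing is
 * configurable; with it, only the MBM events the CPU reports are.
 */
bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
{
	if (!mock_cpu_has(MOCK_FEATURE_BMEC))
		return false;

	switch (evt) {
	case QOS_L3_MBM_TOTAL_EVENT_ID:
		return mock_cpu_has(MOCK_FEATURE_CQM_MBM_TOTAL);
	case QOS_L3_MBM_LOCAL_EVENT_ID:
		return mock_cpu_has(MOCK_FEATURE_CQM_MBM_LOCAL);
	default:
		return false;
	}
}
```

A caller modelled on resctrl_mon_resource_init() would test each MBM event id in turn and mark the matching event configurable only when this returns true.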
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v4:
* Moved all the __init changes to a later patch now that the exit gubbins
comes first.
---
arch/x86/kernel/cpu/resctrl/core.c | 15 +++++++++++++++
arch/x86/kernel/cpu/resctrl/monitor.c | 22 +++++++++++-----------
include/linux/resctrl.h | 2 ++
3 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7d14d80b3f94..43a9291988d3 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -831,6 +831,21 @@ bool __init rdt_cpu_has(int flag)
return ret;
}
+bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+{
+ if (!rdt_cpu_has(X86_FEATURE_BMEC))
+ return false;
+
+ switch (evt) {
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ return rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL);
+ default:
+ return false;
+ }
+}
+
static __init bool get_mem_config(void)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_MBA];
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b7d93670ed94..ab8f33f2277e 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1205,6 +1205,17 @@ int __init resctrl_mon_resource_init(void)
l3_mon_evt_init(r);
+ if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
+ mbm_total_event.configurable = true;
+ resctrl_file_fflags_init("mbm_total_bytes_config",
+ RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+ }
+ if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
+ mbm_local_event.configurable = true;
+ resctrl_file_fflags_init("mbm_local_bytes_config",
+ RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+ }
+
return 0;
}
@@ -1248,17 +1259,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
/* Detect list of bandwidth sources that can be tracked */
cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
-
- if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
- mbm_total_event.configurable = true;
- resctrl_file_fflags_init("mbm_total_bytes_config",
- RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
- }
- if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
- mbm_local_event.configurable = true;
- resctrl_file_fflags_init("mbm_local_bytes_config",
- RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
- }
}
r->mon_capable = true;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 686d33d67456..5c7b9760b63a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -309,6 +309,8 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
u32 resctrl_arch_system_num_rmid_idx(void);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
+bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2025-02-07 18:18 ` [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
@ 2025-02-20 0:13 ` Reinette Chatre
2025-02-28 19:56 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:13 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> When BMEC is supported the resctrl event can be configured in a number
> of ways. This depends on architecture support. rdt_get_mon_l3_config()
> modifies the struct mon_evt and calls mbm_config_rftype_init() to create
> the files that allow the configuration.
>
> Splitting this into separate architecture and filesystem parts would
> require the struct mon_evt and mbm_config_rftype_init() to be exposed.
>
> Instead, add resctrl_arch_is_evt_configurable(), and use this from
> resctrl_mon_resource_init() to initialise struct mon_evt and call
> mbm_config_rftype_init().
> resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
> obviously benefit from being inlined. Putting it in core.c will allow
> rdt_cpu_has() to eventually become static.
>
Please replace all instances of mbm_config_rftype_init() with
resctrl_file_fflags_init().
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v4:
> * Moved all the __init changes to a later patch now that the exit gubbins
> comes first.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 15 +++++++++++++++
> arch/x86/kernel/cpu/resctrl/monitor.c | 22 +++++++++++-----------
> include/linux/resctrl.h | 2 ++
> 3 files changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 7d14d80b3f94..43a9291988d3 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -831,6 +831,21 @@ bool __init rdt_cpu_has(int flag)
> return ret;
> }
>
> +bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
I know resctrl is not consistent in this regard, but I think it would improve
resctrl quality if new additions followed the guidance in Documentation/process/coding-style.rst
(see section 6.1, "Function prototypes") and placed the storage class attribute
(__init) before the return type.
> +{
> + if (!rdt_cpu_has(X86_FEATURE_BMEC))
> + return false;
> +
> + switch (evt) {
> + case QOS_L3_MBM_TOTAL_EVENT_ID:
> + return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
> + case QOS_L3_MBM_LOCAL_EVENT_ID:
> + return rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL);
> + default:
> + return false;
> + }
> +}
> +
> static __init bool get_mem_config(void)
> {
> struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_MBA];
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2025-02-20 0:13 ` Reinette Chatre
@ 2025-02-28 19:56 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:56 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 00:13, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> When BMEC is supported the resctrl event can be configured in a number
>> of ways. This depends on architecture support. rdt_get_mon_l3_config()
>> modifies the struct mon_evt and calls mbm_config_rftype_init() to create
>> the files that allow the configuration.
>>
>> Splitting this into separate architecture and filesystem parts would
>> require the struct mon_evt and mbm_config_rftype_init() to be exposed.
>>
>> Instead, add resctrl_arch_is_evt_configurable(), and use this from
>> resctrl_mon_resource_init() to initialise struct mon_evt and call
>> mbm_config_rftype_init().
>> resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
>> obviously benefit from being inlined. Putting it in core.c will allow
>> rdt_cpu_has() to eventually become static.
> Please replace all instances of mbm_config_rftype_init() with
> resctrl_file_fflags_init().
Fixed, sorry I didn't spot that.
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 7d14d80b3f94..43a9291988d3 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -831,6 +831,21 @@ bool __init rdt_cpu_has(int flag)
>> return ret;
>> }
>>
>> +bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
> I know resctrl is not consistent in this regard but I think that it would improve
> resctrl quality if new additions follow guidance from Documentation/process/coding-style.rst
> (see section 6.1) Function prototypes) to place storage class attribute
> (__init) before return type.
Done.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (18 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 19/42] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:37 ` Reinette Chatre
2025-02-27 20:26 ` Moger, Babu
2025-02-07 18:18 ` [PATCH v6 21/42] x86/resctrl: Move mba_mbps_default_event init to filesystem code James Morse
` (23 subsequent siblings)
43 siblings, 2 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
mon_event_config_{read,write}() are called via IPI and access model
specific registers to do their work.
To support another architecture, this needs abstracting.
Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
prefix, and move their struct mon_config_info parameter into
<linux/resctrl.h>. This allows another architecture to supply an
implementation of these.
As struct mon_config_info is now exposed globally, give it a 'resctrl_'
prefix. MPAM systems need access to the domain to do this work, add
the resource and domain to struct resctrl_mon_config_info.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Renamed info to config_info.
* Added description of which fields are read and written in the structure.
* Clarified comment about which CPU this is called on for both kinds of
reader.
Changes since v3:
* Added comments over the read/write helper to explain the type of the void
pointer.
Changes since v1:
* [Whitespace only] Re-tabbed struct resctrl_mon_config_info in
<linux/resctrl.h> to fit the prevailing style.
Non-functional change.
* [Commit message only] Reword to align with the actual naming of the
definitions and destination header file.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 46 +++++++++++++-------------
include/linux/resctrl.h | 31 +++++++++++++++++
2 files changed, 54 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index eb32fbc3abea..e7d1d8b6983d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1580,11 +1580,6 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
return ret;
}
-struct mon_config_info {
- u32 evtid;
- u32 mon_config;
-};
-
#define INVALID_CONFIG_INDEX UINT_MAX
/**
@@ -1609,31 +1604,32 @@ static inline unsigned int mon_event_config_index_get(u32 evtid)
}
}
-static void mon_event_config_read(void *info)
+void resctrl_arch_mon_event_config_read(void *_config_info)
{
- struct mon_config_info *mon_info = info;
+ struct resctrl_mon_config_info *config_info = _config_info;
unsigned int index;
u64 msrval;
- index = mon_event_config_index_get(mon_info->evtid);
+ index = mon_event_config_index_get(config_info->evtid);
if (index == INVALID_CONFIG_INDEX) {
- pr_warn_once("Invalid event id %d\n", mon_info->evtid);
+ pr_warn_once("Invalid event id %d\n", config_info->evtid);
return;
}
rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
/* Report only the valid event configuration bits */
- mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
+ config_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
}
-static void mondata_config_read(struct rdt_mon_domain *d, struct mon_config_info *mon_info)
+static void mondata_config_read(struct resctrl_mon_config_info *mon_info)
{
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_read, mon_info, 1);
+ smp_call_function_any(&mon_info->d->hdr.cpu_mask,
+ resctrl_arch_mon_event_config_read, mon_info, 1);
}
static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
{
- struct mon_config_info mon_info;
+ struct resctrl_mon_config_info mon_info;
struct rdt_mon_domain *dom;
bool sep = false;
@@ -1644,9 +1640,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
if (sep)
seq_puts(s, ";");
- memset(&mon_info, 0, sizeof(struct mon_config_info));
+ memset(&mon_info, 0, sizeof(struct resctrl_mon_config_info));
+ mon_info.r = r;
+ mon_info.d = dom;
mon_info.evtid = evtid;
- mondata_config_read(dom, &mon_info);
+ mondata_config_read(&mon_info);
seq_printf(s, "%d=0x%02x", dom->hdr.id, mon_info.mon_config);
sep = true;
@@ -1679,30 +1677,32 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
return 0;
}
-static void mon_event_config_write(void *info)
+void resctrl_arch_mon_event_config_write(void *_config_info)
{
- struct mon_config_info *mon_info = info;
+ struct resctrl_mon_config_info *config_info = _config_info;
unsigned int index;
- index = mon_event_config_index_get(mon_info->evtid);
+ index = mon_event_config_index_get(config_info->evtid);
if (index == INVALID_CONFIG_INDEX) {
- pr_warn_once("Invalid event id %d\n", mon_info->evtid);
+ pr_warn_once("Invalid event id %d\n", config_info->evtid);
return;
}
- wrmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, 0);
+ wrmsr(MSR_IA32_EVT_CFG_BASE + index, config_info->mon_config, 0);
}
static void mbm_config_write_domain(struct rdt_resource *r,
struct rdt_mon_domain *d, u32 evtid, u32 val)
{
- struct mon_config_info mon_info = {0};
+ struct resctrl_mon_config_info mon_info = {0};
/*
* Read the current config value first. If both are the same then
* no need to write it again.
*/
+ mon_info.r = r;
+ mon_info.d = d;
mon_info.evtid = evtid;
- mondata_config_read(d, &mon_info);
+ mondata_config_read(&mon_info);
if (mon_info.mon_config == val)
return;
@@ -1714,7 +1714,7 @@ static void mbm_config_write_domain(struct rdt_resource *r,
* are scoped at the domain level. Writing any of these MSRs
* on one CPU is observed by all the CPUs in the domain.
*/
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_write,
+ smp_call_function_any(&d->hdr.cpu_mask, resctrl_arch_mon_event_config_write,
&mon_info, 1);
/*
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 5c7b9760b63a..59d944e139f8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -270,6 +270,13 @@ struct resctrl_cpu_defaults {
u32 rmid;
};
+struct resctrl_mon_config_info {
+ struct rdt_resource *r;
+ struct rdt_mon_domain *d;
+ u32 evtid;
+ u32 mon_config;
+};
+
/**
* resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
* Call via IPI.
@@ -311,6 +318,30 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+/**
+ * resctrl_arch_mon_event_config_write() - Write the config for an event.
+ * @config_info: struct resctrl_mon_config_info describing the resource, domain
+ * and event.
+ *
+ * Reads resource, domain and eventid from @config_info and writes the
+ * event config_info->mon_config into hardware.
+ *
+ * Called via IPI to reach a CPU that is a member of the specified domain.
+ */
+void resctrl_arch_mon_event_config_write(void *config_info);
+
+/**
+ * resctrl_arch_mon_event_config_read() - Read the config for an event.
+ * @config_info: struct resctrl_mon_config_info describing the resource, domain
+ * and event.
+ *
+ * Reads resource, domain and eventid from @config_info and reads the
+ * hardware config value into config_info->mon_config.
+ *
+ * Called via IPI to reach a CPU that is a member of the specified domain.
+ */
+void resctrl_arch_mon_event_config_read(void *config_info);
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2025-02-07 18:18 ` [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2025-02-20 0:37 ` Reinette Chatre
2025-02-27 20:26 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:37 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> mon_event_config_{read,write}() are called via IPI and access model
> specific registers to do their work.
>
> To support another architecture, this needs abstracting.
>
> Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
> prefix, and move their struct mon_config_info parameter into
> <linux/resctrl.h>. This allows another architecture to supply an
> implementation of these.
>
> As struct mon_config_info is now exposed globally, give it a 'resctrl_'
> prefix. MPAM systems need access to the domain to do this work, add
> the resource and domain to struct resctrl_mon_config_info.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2025-02-07 18:18 ` [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
2025-02-20 0:37 ` Reinette Chatre
@ 2025-02-27 20:26 ` Moger, Babu
2025-02-28 19:54 ` James Morse
1 sibling, 1 reply; 135+ messages in thread
From: Moger, Babu @ 2025-02-27 20:26 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi James,
On 2/7/25 12:18, James Morse wrote:
> mon_event_config_{read,write}() are called via IPI and access model
> specific registers to do their work.
>
> To support another architecture, this needs abstracting.
>
> Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
> prefix, and move their struct mon_config_info parameter into
> <linux/resctrl.h>. This allows another architecture to supply an
> implementation of these.
>
> As struct mon_config_info is now exposed globally, give it a 'resctrl_'
> prefix. MPAM systems need access to the domain to do this work, add
> the resource and domain to struct resctrl_mon_config_info.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v5:
> * Renamed info to config_info.
> * Added description of which fields are read and written in the structure.
> * Clarified comment about which CPU this is called on for both kinds of
> reader.
>
> Changes since v3:
> * Added comments over the read/write helper to explain the type of the void
> pointer.
>
> Changes since v1:
> * [Whitespace only] Re-tabbed struct resctrl_mon_config_info in
> <linux/resctrl.h> to fit the prevailing style.
>
> Non-functional change.
>
> * [Commit message only] Reword to align with the actual naming of the
> definitions and destination header file.
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 46 +++++++++++++-------------
> include/linux/resctrl.h | 31 +++++++++++++++++
> 2 files changed, 54 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index eb32fbc3abea..e7d1d8b6983d 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1580,11 +1580,6 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
> return ret;
> }
>
> -struct mon_config_info {
> - u32 evtid;
> - u32 mon_config;
> -};
> -
> #define INVALID_CONFIG_INDEX UINT_MAX
>
> /**
> @@ -1609,31 +1604,32 @@ static inline unsigned int mon_event_config_index_get(u32 evtid)
> }
> }
>
> -static void mon_event_config_read(void *info)
> +void resctrl_arch_mon_event_config_read(void *_config_info)
> {
> - struct mon_config_info *mon_info = info;
> + struct resctrl_mon_config_info *config_info = _config_info;
> unsigned int index;
> u64 msrval;
>
> - index = mon_event_config_index_get(mon_info->evtid);
> + index = mon_event_config_index_get(config_info->evtid);
> if (index == INVALID_CONFIG_INDEX) {
> - pr_warn_once("Invalid event id %d\n", mon_info->evtid);
> + pr_warn_once("Invalid event id %d\n", config_info->evtid);
> return;
> }
> rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
>
> /* Report only the valid event configuration bits */
> - mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
> + config_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
> }
>
> -static void mondata_config_read(struct rdt_mon_domain *d, struct mon_config_info *mon_info)
> +static void mondata_config_read(struct resctrl_mon_config_info *mon_info)
> {
> - smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_read, mon_info, 1);
> + smp_call_function_any(&mon_info->d->hdr.cpu_mask,
> + resctrl_arch_mon_event_config_read, mon_info, 1);
> }
>
> static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
> {
> - struct mon_config_info mon_info;
> + struct resctrl_mon_config_info mon_info;
> struct rdt_mon_domain *dom;
> bool sep = false;
>
> @@ -1644,9 +1640,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
> if (sep)
> seq_puts(s, ";");
>
> - memset(&mon_info, 0, sizeof(struct mon_config_info));
> + memset(&mon_info, 0, sizeof(struct resctrl_mon_config_info));
> + mon_info.r = r;
> + mon_info.d = dom;
> mon_info.evtid = evtid;
> - mondata_config_read(dom, &mon_info);
> + mondata_config_read(&mon_info);
>
> seq_printf(s, "%d=0x%02x", dom->hdr.id, mon_info.mon_config);
> sep = true;
> @@ -1679,30 +1677,32 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
> return 0;
> }
>
> -static void mon_event_config_write(void *info)
> +void resctrl_arch_mon_event_config_write(void *_config_info)
> {
> - struct mon_config_info *mon_info = info;
> + struct resctrl_mon_config_info *config_info = _config_info;
> unsigned int index;
>
> - index = mon_event_config_index_get(mon_info->evtid);
> + index = mon_event_config_index_get(config_info->evtid);
> if (index == INVALID_CONFIG_INDEX) {
> - pr_warn_once("Invalid event id %d\n", mon_info->evtid);
> + pr_warn_once("Invalid event id %d\n", config_info->evtid);
> return;
> }
> - wrmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, 0);
> + wrmsr(MSR_IA32_EVT_CFG_BASE + index, config_info->mon_config, 0);
> }
>
> static void mbm_config_write_domain(struct rdt_resource *r,
> struct rdt_mon_domain *d, u32 evtid, u32 val)
> {
> - struct mon_config_info mon_info = {0};
> + struct resctrl_mon_config_info mon_info = {0};
>
> /*
> * Read the current config value first. If both are the same then
> * no need to write it again.
> */
> + mon_info.r = r;
> + mon_info.d = d;
> mon_info.evtid = evtid;
> - mondata_config_read(d, &mon_info);
> + mondata_config_read(&mon_info);
> if (mon_info.mon_config == val)
> return;
>
> @@ -1714,7 +1714,7 @@ static void mbm_config_write_domain(struct rdt_resource *r,
> * are scoped at the domain level. Writing any of these MSRs
> * on one CPU is observed by all the CPUs in the domain.
> */
> - smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_write,
> + smp_call_function_any(&d->hdr.cpu_mask, resctrl_arch_mon_event_config_write,
> &mon_info, 1);
>
> /*
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 5c7b9760b63a..59d944e139f8 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -270,6 +270,13 @@ struct resctrl_cpu_defaults {
> u32 rmid;
> };
>
> +struct resctrl_mon_config_info {
> + struct rdt_resource *r;
> + struct rdt_mon_domain *d;
> + u32 evtid;
> + u32 mon_config;
> +};
Isn't this an architecture-specific definition? Why is it in the common
resctrl.h file?
> +
> /**
> * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
> * Call via IPI.
> @@ -311,6 +318,30 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
>
> bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
>
> +/**
> + * resctrl_arch_mon_event_config_write() - Write the config for an event.
> + * @config_info: struct resctrl_mon_config_info describing the resource, domain
> + * and event.
> + *
> + * Reads resource, domain and eventid from @config_info and writes the
> + * event config_info->mon_config into hardware.
> + *
> + * Called via IPI to reach a CPU that is a member of the specified domain.
> + */
> +void resctrl_arch_mon_event_config_write(void *config_info);
> +
> +/**
> + * resctrl_arch_mon_event_config_read() - Read the config for an event.
> + * @config_info: struct resctrl_mon_config_info describing the resource, domain
> + * and event.
> + *
> + * Reads resource, domain and eventid from @config_info and reads the
> + * hardware config value into config_info->mon_config.
> + *
> + * Called via IPI to reach a CPU that is a member of the specified domain.
> + */
> +void resctrl_arch_mon_event_config_read(void *config_info);
> +
> /*
> * Update the ctrl_val and apply this config right now.
> * Must be called on one of the domain's CPUs.
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2025-02-27 20:26 ` Moger, Babu
@ 2025-02-28 19:54 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:54 UTC (permalink / raw)
To: babu.moger, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Babu,
On 27/02/2025 20:26, Moger, Babu wrote:
> On 2/7/25 12:18, James Morse wrote:
>> mon_event_config_{read,write}() are called via IPI and access model
>> specific registers to do their work.
>>
>> To support another architecture, this needs abstracting.
>>
>> Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
>> prefix, and move their struct mon_config_info parameter into
>> <linux/resctrl.h>. This allows another architecture to supply an
>> implementation of these.
>>
>> As struct mon_config_info is now exposed globally, give it a 'resctrl_'
>> prefix. MPAM systems need access to the domain to do this work, add
>> the resource and domain to struct resctrl_mon_config_info.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 5c7b9760b63a..59d944e139f8 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -270,6 +270,13 @@ struct resctrl_cpu_defaults {
>> u32 rmid;
>> };
>>
>> +struct resctrl_mon_config_info {
>> + struct rdt_resource *r;
>> + struct rdt_mon_domain *d;
>> + u32 evtid;
>> + u32 mon_config;
>> +};
> Isn't this an architecture-specific definition? Why is it in the common
> resctrl.h file?
Because mbm_config_write_domain() and mbm_config_show() need to pass this set of
information via IPI to another CPU to call resctrl_arch_mon_event_config_read() or
resctrl_arch_mon_event_config_write().
The definition can't belong to the arch code - otherwise it would have to be duplicated
across all architectures, and would need the same members in each.
As much of the IPI-ing as possible is in the resctrl filesystem code, so that if we can
reduce the IPIs for one architecture, every architecture benefits.
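A minimal user-space sketch of that split, under stated assumptions: the two
struct stand-ins, fake_ipi() (in place of smp_call_function_any()), the
fake_cfg_reg array and every sketch_* name are hypothetical and exist only for
illustration. Only struct resctrl_mon_config_info follows the layout in the
patch. The point is that the caller fills in the struct and the callee receives
it as a void pointer, so both sides need the same definition:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_EVENTS 2

/* Hypothetical stand-ins for the kernel types; only what the sketch needs. */
struct rdt_resource {
	const char *name;
};

struct rdt_mon_domain {
	int id;
	/* Pretend per-domain config registers, indexed by event. */
	uint32_t fake_cfg_reg[SKETCH_NR_EVENTS];
};

/* Same members as the struct this patch adds to <linux/resctrl.h>. */
struct resctrl_mon_config_info {
	struct rdt_resource	*r;
	struct rdt_mon_domain	*d;
	uint32_t		evtid;
	uint32_t		mon_config;
};

/* Stand-in for an IPI: just call fn(info) directly on this "CPU". */
static void fake_ipi(void (*fn)(void *), void *info)
{
	fn(info);
}

/* "Arch" side: unpack the void pointer and touch the (fake) hardware. */
void sketch_arch_mon_event_config_read(void *_config_info)
{
	struct resctrl_mon_config_info *config_info = _config_info;

	if (config_info->evtid >= SKETCH_NR_EVENTS)
		return;
	config_info->mon_config = config_info->d->fake_cfg_reg[config_info->evtid];
}

void sketch_arch_mon_event_config_write(void *_config_info)
{
	struct resctrl_mon_config_info *config_info = _config_info;

	if (config_info->evtid >= SKETCH_NR_EVENTS)
		return;
	config_info->d->fake_cfg_reg[config_info->evtid] = config_info->mon_config;
}

/* "Filesystem" side: owns the struct, fills it in, and does the IPI. */
uint32_t sketch_fs_read_config(struct rdt_resource *r,
			       struct rdt_mon_domain *d, uint32_t evtid)
{
	struct resctrl_mon_config_info info = { .r = r, .d = d, .evtid = evtid };

	fake_ipi(sketch_arch_mon_event_config_read, &info);
	return info.mon_config;
}

void sketch_fs_write_config(struct rdt_resource *r, struct rdt_mon_domain *d,
			    uint32_t evtid, uint32_t val)
{
	struct resctrl_mon_config_info info = { .r = r, .d = d, .evtid = evtid };

	/* Read first; skip the write if the value is already in place. */
	fake_ipi(sketch_arch_mon_event_config_read, &info);
	if (info.mon_config == val)
		return;

	info.mon_config = val;
	fake_ipi(sketch_arch_mon_event_config_write, &info);
}
```

If the struct lived in an arch header instead, each architecture would need its
own copy with identical members for the callers above to compile unchanged.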
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 21/42] x86/resctrl: Move mba_mbps_default_event init to filesystem code
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (19 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 20/42] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:42 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 22/42] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
` (22 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
mba_mbps_default_event is initialised based on whether mbm_local or
mbm_total is supported. In the case of both, it is initialised to
mbm_local. mba_mbps_default_event is initialised in core.c's
get_rdt_mon_resources(), while all the readers are in rdtgroup.c.
After this code is split into architecture specific and filesystem code,
get_rdt_mon_resources() remains part of the architecture code, which
would mean mba_mbps_default_event has to be exposed by the filesystem
code.
Move the initialisation to the filesystem's resctrl_mon_resource_init().
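The selection order described above can be sketched in user-space C. This is
an illustration only: the sketch_* enum and function are hypothetical names,
the two booleans stand in for resctrl_arch_is_mbm_local_enabled() and
resctrl_arch_is_mbm_total_enabled(), and the explicit "none" value is an
assumption (the patch simply leaves the variable untouched in that case):

```c
#include <stdbool.h>

/* Hypothetical event ids for the sketch; not the kernel's enum values. */
enum sketch_event_id {
	SKETCH_EVENT_NONE = 0,
	SKETCH_MBM_TOTAL,
	SKETCH_MBM_LOCAL,
};

/* Prefer mbm_local when both are available, fall back to mbm_total. */
enum sketch_event_id sketch_pick_default_event(bool local_enabled,
						bool total_enabled)
{
	if (local_enabled)
		return SKETCH_MBM_LOCAL;
	if (total_enabled)
		return SKETCH_MBM_TOTAL;
	return SKETCH_EVENT_NONE;
}
```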
Signed-off-by: James Morse <james.morse@arm.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 5 -----
arch/x86/kernel/cpu/resctrl/monitor.c | 5 +++++
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 43a9291988d3..1fb4eb4e0ea9 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -924,11 +924,6 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;
- if (resctrl_arch_is_mbm_local_enabled())
- mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
- else if (resctrl_arch_is_mbm_total_enabled())
- mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
-
return !rdt_get_mon_l3_config(r);
}
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ab8f33f2277e..17968cafc288 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1216,6 +1216,11 @@ int __init resctrl_mon_resource_init(void)
RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
}
+ if (resctrl_arch_is_mbm_local_enabled())
+ mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
+ else if (resctrl_arch_is_mbm_total_enabled())
+ mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
+
return 0;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 21/42] x86/resctrl: Move mba_mbps_default_event init to filesystem code
2025-02-07 18:18 ` [PATCH v6 21/42] x86/resctrl: Move mba_mbps_default_event init to filesystem code James Morse
@ 2025-02-20 0:42 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:42 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> mba_mbps_default_event is initialised based on whether mbm_local or
> mbm_total is supported. In the case of both, it is initialised to
> mbm_local. mba_mbps_default_event is initialised in core.c's
> get_rdt_mon_resources(), while all the readers are in rdtgroup.c.
>
> After this code is split into architecture specific and filesystem code,
> get_rdt_mon_resources() remains part of the architecture code, which
> would mean mba_mbps_default_event has to be exposed by the filesystem
> code.
>
> Move the initialisation to the filesystem's resctrl_mon_resource_init()
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 22/42] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (20 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 21/42] x86/resctrl: Move mba_mbps_default_event init to filesystem code James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:45 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
` (21 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The mbm_cfg_mask field lists the bits that user-space can set when
configuring an event. This value is output via the last_cmd_status
file.
Once the filesystem parts of resctrl are moved to live in /fs/, the
struct rdt_hw_resource is inaccessible to the filesystem code. Because
this value is output to user-space, it has to be accessible to the
filesystem code.
Move it to struct rdt_resource.
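For reference, the check that consumes this mask in mon_config_write() is a
plain subset test. A minimal user-space sketch, assuming a hypothetical helper
name (sketch_mbm_config_valid) and fixed-width types in place of the kernel's:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A requested config is acceptable only if every bit set in @val is also
 * set in @mbm_cfg_mask, i.e. masking does not change the value.
 */
bool sketch_mbm_config_valid(uint32_t val, uint32_t mbm_cfg_mask)
{
	return (val & mbm_cfg_mask) == val;
}
```

When the test fails, the filesystem code reports the mask through
last_cmd_status, which is why the value has to be readable without struct
rdt_hw_resource.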
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Rephrased a comment to remove one vendor's marketing name for the feature.
Changes since v1:
* Reword comments to avoid being overly arch-specific.
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++---
include/linux/resctrl.h | 3 +++
4 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 4a5996d1e060..725f223ea07b 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -417,8 +417,6 @@ struct msr_param {
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
- * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
- * Monitoring Event Configuration (BMEC) is supported.
* @cdp_enabled: CDP state of this resource
*
* Members of this structure are either private to the architecture
@@ -432,7 +430,6 @@ struct rdt_hw_resource {
void (*msr_update)(struct msr_param *m);
unsigned int mon_scale;
unsigned int mbm_width;
- unsigned int mbm_cfg_mask;
bool cdp_enabled;
};
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 17968cafc288..d99a05fc1b44 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1263,7 +1263,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
/* Detect list of bandwidth sources that can be tracked */
cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
- hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+ r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
}
r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e7d1d8b6983d..a388ef66ef4c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1731,7 +1731,6 @@ static void mbm_config_write_domain(struct rdt_resource *r,
static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
{
- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
char *dom_str = NULL, *id_str;
unsigned long dom_id, val;
struct rdt_mon_domain *d;
@@ -1758,9 +1757,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
}
/* Value from user cannot be more than the supported set of events */
- if ((val & hw_res->mbm_cfg_mask) != val) {
+ if ((val & r->mbm_cfg_mask) != val) {
rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
- hw_res->mbm_cfg_mask);
+ r->mbm_cfg_mask);
return -EINVAL;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 59d944e139f8..4d02e34c4401 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -217,6 +217,8 @@ enum resctrl_schema_fmt {
* @name: Name to use in "schemata" file.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
+ * @mbm_cfg_mask: Bandwidth sources that can be tracked when bandwidth
+ * monitoring events can be configured.
* @cdp_capable: Is the CDP feature available on this resource
*/
struct rdt_resource {
@@ -233,6 +235,7 @@ struct rdt_resource {
char *name;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
+ unsigned int mbm_cfg_mask;
bool cdp_capable;
};
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 22/42] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
2025-02-07 18:18 ` [PATCH v6 22/42] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2025-02-20 0:45 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:45 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> The mbm_cfg_mask field lists the bits that user-space can set when
> configuring an event. This value is output via the last_cmd_status
> file.
>
> Once the filesystem parts of resctrl are moved to live in /fs/, the
> struct rdt_hw_resource is inaccessible to the filesystem code. Because
> this value is output to user-space, it has to be accessible to the
> filesystem code.
>
> Move it to struct rdt_resource.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (21 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 22/42] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:53 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 24/42] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
` (20 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
resctrl's pseudo lock has some copy-to-cache and measurement
functions that are micro-architecture specific.
For example, pseudo_lock_fn() is not at all portable.
Label these 'resctrl_arch_' so they stay under /arch/x86.
To expose these functions to the filesystem code they need an entry
in a header file, and can't be marked static.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Expanded commit message.
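Note for readers (illustration only, not part of the patch): the helpers
lose 'static' because their caller is expected to end up in a different
file, so it needs prototypes from a header and symbols it can link to.
A minimal userspace sketch of that split follows. All sketch_* names are
hypothetical, and the empty bodies stand in for the real measurement code.

```c
#include <stddef.h>

/* What would live in a shared header: prototypes the caller can see. */
typedef int (*sketch_thread_fn)(void *data);
int sketch_arch_measure_cycles_lat_fn(void *_plr);
int sketch_arch_measure_l2_residency(void *_plr);
int sketch_arch_measure_l3_residency(void *_plr);

/* "Arch" side: non-static so another translation unit can link to them. */
int sketch_arch_measure_cycles_lat_fn(void *_plr) { (void)_plr; return 0; }
int sketch_arch_measure_l2_residency(void *_plr) { (void)_plr; return 0; }
int sketch_arch_measure_l3_residency(void *_plr) { (void)_plr; return 0; }

/* "Filesystem" side: pick the thread function from the selector value. */
sketch_thread_fn sketch_pick_measure_fn(int sel)
{
	switch (sel) {
	case 1:
		return sketch_arch_measure_cycles_lat_fn;
	case 2:
		return sketch_arch_measure_l2_residency;
	case 3:
		return sketch_arch_measure_l3_residency;
	default:
		return NULL;	/* unknown selector: nothing to run */
	}
}
```

The selector values 1 to 3 follow pseudo_lock_measure_cycles() in the
diff below; returning NULL for anything else is a choice made for this
sketch only.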
---
arch/x86/include/asm/resctrl.h | 5 ++++
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 36 ++++++++++++-----------
2 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 6d4c7ea2c9e3..86407dbde583 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -203,6 +203,11 @@ static inline void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid
static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
void *ctx) { };
+u64 resctrl_arch_get_prefetch_disable_bits(void);
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_measure_cycles_lat_fn(void *_plr);
+int resctrl_arch_measure_l2_residency(void *_plr);
+int resctrl_arch_measure_l3_residency(void *_plr);
void resctrl_cpu_detect(struct cpuinfo_x86 *c);
#else
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 42cc162f7fc9..d078b89380dd 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -61,7 +61,8 @@ static const struct class pseudo_lock_class = {
};
/**
- * get_prefetch_disable_bits - prefetch disable bits of supported platforms
+ * resctrl_arch_get_prefetch_disable_bits - prefetch disable bits of supported
+ * platforms
* @void: It takes no parameters.
*
* Capture the list of platforms that have been validated to support
@@ -75,13 +76,13 @@ static const struct class pseudo_lock_class = {
* in the SDM.
*
* When adding a platform here also add support for its cache events to
- * measure_cycles_perf_fn()
+ * resctrl_arch_measure_l*_residency()
*
* Return:
* If platform is supported, the bits to disable hardware prefetchers, 0
* if platform is not supported.
*/
-static u64 get_prefetch_disable_bits(void)
+u64 resctrl_arch_get_prefetch_disable_bits(void)
{
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
boot_cpu_data.x86 != 6)
@@ -408,7 +409,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
}
/**
- * pseudo_lock_fn - Load kernel memory into cache
+ * resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
* @_rdtgrp: resource group to which pseudo-lock region belongs
*
* This is the core pseudo-locking flow.
@@ -426,7 +427,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-static int pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
{
struct rdtgroup *rdtgrp = _rdtgrp;
struct pseudo_lock_region *plr = rdtgrp->plr;
@@ -712,7 +713,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* Not knowing the bits to disable prefetching implies that this
* platform does not support Cache Pseudo-Locking.
*/
- prefetch_disable_bits = get_prefetch_disable_bits();
+ prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
if (prefetch_disable_bits == 0) {
rdt_last_cmd_puts("Pseudo-locking not supported\n");
return -EINVAL;
@@ -872,7 +873,8 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
}
/**
- * measure_cycles_lat_fn - Measure cycle latency to read pseudo-locked memory
+ * resctrl_arch_measure_cycles_lat_fn - Measure cycle latency to read
+ * pseudo-locked memory
* @_plr: pseudo-lock region to measure
*
* There is no deterministic way to test if a memory region is cached. One
@@ -885,7 +887,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-static int measure_cycles_lat_fn(void *_plr)
+int resctrl_arch_measure_cycles_lat_fn(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
u32 saved_low, saved_high;
@@ -1069,7 +1071,7 @@ static int measure_residency_fn(struct perf_event_attr *miss_attr,
return 0;
}
-static int measure_l2_residency(void *_plr)
+int resctrl_arch_measure_l2_residency(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
struct residency_counts counts = {0};
@@ -1107,7 +1109,7 @@ static int measure_l2_residency(void *_plr)
return 0;
}
-static int measure_l3_residency(void *_plr)
+int resctrl_arch_measure_l3_residency(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
struct residency_counts counts = {0};
@@ -1205,14 +1207,14 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
plr->cpu = cpu;
if (sel == 1)
- thread = kthread_run_on_cpu(measure_cycles_lat_fn, plr,
- cpu, "pseudo_lock_measure/%u");
+ thread = kthread_run_on_cpu(resctrl_arch_measure_cycles_lat_fn,
+ plr, cpu, "pseudo_lock_measure/%u");
else if (sel == 2)
- thread = kthread_run_on_cpu(measure_l2_residency, plr,
- cpu, "pseudo_lock_measure/%u");
+ thread = kthread_run_on_cpu(resctrl_arch_measure_l2_residency,
+ plr, cpu, "pseudo_lock_measure/%u");
else if (sel == 3)
- thread = kthread_run_on_cpu(measure_l3_residency, plr,
- cpu, "pseudo_lock_measure/%u");
+ thread = kthread_run_on_cpu(resctrl_arch_measure_l3_residency,
+ plr, cpu, "pseudo_lock_measure/%u");
else
goto out;
@@ -1307,7 +1309,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
plr->thread_done = 0;
- thread = kthread_run_on_cpu(pseudo_lock_fn, rdtgrp,
+ thread = kthread_run_on_cpu(resctrl_arch_pseudo_lock_fn, rdtgrp,
plr->cpu, "pseudo_lock/%u");
if (IS_ERR(thread)) {
ret = PTR_ERR(thread);
--
2.39.2
* Re: [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2025-02-07 18:18 ` [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
@ 2025-02-20 0:53 ` Reinette Chatre
2025-02-28 19:57 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:53 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> resctrl's pseudo lock has some copy-to-cache and measurement
> functions that are micro-architecture specific.
>
> For example, pseudo_lock_fn() is not at all portable.
>
> Label these 'resctrl_arch_' so they stay under /arch/x86.
> To expose these functions to the filesystem code they need an entry
> in a header file, and can't be marked static.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
...
> -static int measure_l3_residency(void *_plr)
> +int resctrl_arch_measure_l3_residency(void *_plr)
> {
> struct pseudo_lock_region *plr = _plr;
> struct residency_counts counts = {0};
> @@ -1205,14 +1207,14 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
> plr->cpu = cpu;
>
> if (sel == 1)
> - thread = kthread_run_on_cpu(measure_cycles_lat_fn, plr,
> - cpu, "pseudo_lock_measure/%u");
> + thread = kthread_run_on_cpu(resctrl_arch_measure_cycles_lat_fn,
> + plr, cpu, "pseudo_lock_measure/%u");
checkpatch.pl does not like this extra space that sneaked in.
With spacing fixed:
| Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* Re: [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2025-02-20 0:53 ` Reinette Chatre
@ 2025-02-28 19:57 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:57 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 00:53, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> resctrl's pseudo lock has some copy-to-cache and measurement
>> functions that are micro-architecture specific.
>>
>> For example, pseudo_lock_fn() is not at all portable.
>>
>> Label these 'resctrl_arch_' so they stay under /arch/x86.
>> To expose these functions to the filesystem code they need an entry
>> in a header file, and can't be marked static.
>> -static int measure_l3_residency(void *_plr)
>> +int resctrl_arch_measure_l3_residency(void *_plr)
>> {
>> struct pseudo_lock_region *plr = _plr;
>> struct residency_counts counts = {0};
>> @@ -1205,14 +1207,14 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
>> plr->cpu = cpu;
>>
>> if (sel == 1)
>> - thread = kthread_run_on_cpu(measure_cycles_lat_fn, plr,
>> - cpu, "pseudo_lock_measure/%u");
>> + thread = kthread_run_on_cpu(resctrl_arch_measure_cycles_lat_fn,
>> + plr, cpu, "pseudo_lock_measure/%u");
>
> checkpatch.pl does not like this extra space that sneaked in.
Looks like I missed the step to re-check that after the rebase.
> With spacing fixed:
> | Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Thanks!
James
* [PATCH v6 24/42] x86/resctrl: Allow an architecture to disable pseudo lock
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (22 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 23/42] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:56 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 25/42] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
` (19 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Pseudo-lock relies on knowledge of the micro-architecture to disable
prefetchers etc.
On arm64 these controls are typically secure only, meaning Linux can't
access them. Arm's cache-lockdown feature works in a very different
way. Resctrl's pseudo-lock isn't going to be used on arm64 platforms.
Add a Kconfig symbol that can be selected by the architecture. This
enables or disables building of the pseudo_lock.c file, and replaces
the functions with stubs. An additional IS_ENABLED() check is needed
in rdtgroup_mode_write() so that attempting to enable pseudo-lock
reports an "Unknown or unsupported mode" to user-space via the
last_cmd_status file.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v4:
* "last_cmd file" -> "last_cmd_status file"
Changes since v2:
* Clarified the commit message as to where the error string is printed.
Changes since v1:
* [Commit message only] Typo fix:
s/psuedo/pseudo/g
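Note for readers (illustration only, not part of the patch): the stub
plus IS_ENABLED() arrangement described above can be modelled in
userspace roughly as follows. SKETCH_PSEUDO_LOCK is a hypothetical
stand-in for the Kconfig symbol, the sketch_* functions are invented for
the example, and the -EINVAL return for an unsupported mode is an
assumption of this sketch.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_RESCTRL_FS_PSEUDO_LOCK; 0 models arm64. */
#ifndef SKETCH_PSEUDO_LOCK
#define SKETCH_PSEUDO_LOCK 0
#endif

#if SKETCH_PSEUDO_LOCK
/* Stand-in for the real implementation built from pseudo_lock.c. */
static inline int sketch_locksetup_enter(void) { return 0; }
#else
/* Stub used when the feature is compiled out. */
static inline int sketch_locksetup_enter(void) { return -EOPNOTSUPP; }
#endif

int sketch_mode_write(const char *buf)
{
	if (!strcmp(buf, "exclusive"))
		return 0;

	/* Constant check first, so the stub is never called when disabled. */
	if (SKETCH_PSEUDO_LOCK && !strcmp(buf, "pseudo-locksetup"))
		return sketch_locksetup_enter();

	/* "Unknown or unsupported mode"; -EINVAL is assumed in this sketch. */
	return -EINVAL;
}
```

With the symbol off, writing "pseudo-locksetup" takes the same path as
any unrecognised string, which is why the extra check is needed on top of
the stubs: without it user-space would see the stub's -EOPNOTSUPP rather
than the usual unsupported-mode error.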
---
arch/x86/Kconfig | 7 ++++
arch/x86/kernel/cpu/resctrl/Makefile | 5 +--
arch/x86/kernel/cpu/resctrl/internal.h | 49 +++++++++++++++++++++-----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 +-
4 files changed, 53 insertions(+), 11 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 87198d957e2f..41dda57c4953 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -505,6 +505,7 @@ config X86_CPU_RESCTRL
depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select KERNFS
select PROC_CPU_RESCTRL if PROC_FS
+ select RESCTRL_FS_PSEUDO_LOCK
help
Enable x86 CPU resource control support.
@@ -521,6 +522,12 @@ config X86_CPU_RESCTRL
Say N if unsure.
+config RESCTRL_FS_PSEUDO_LOCK
+ bool
+ help
+ Software mechanism to pin data in a cache portion using
+ micro-architecture specific knowledge.
+
config X86_FRED
bool "Flexible Return and Event Delivery"
depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index 4a06c37b9cf1..0c13b0befd8a 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
-obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o pseudo_lock.o
+obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
+obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
CFLAGS_pseudo_lock.o = -I$(src)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 725f223ea07b..8d35bb423aad 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -512,14 +512,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
unsigned long cbm);
enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
int rdtgroup_tasks_assigned(struct rdtgroup *r);
-int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
-int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
-bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm);
-bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d);
-int rdt_pseudo_lock_init(void);
-void rdt_pseudo_lock_release(void);
-int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
-void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
int closids_supported(void);
@@ -551,4 +543,45 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
+
+#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
+int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
+int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
+bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm);
+bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d);
+int rdt_pseudo_lock_init(void);
+void rdt_pseudo_lock_release(void);
+int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
+void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
+#else
+static inline int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm)
+{
+ return false;
+}
+
+static inline bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
+{
+ return false;
+}
+
+static inline int rdt_pseudo_lock_init(void) { return 0; }
+static inline void rdt_pseudo_lock_release(void) { }
+static inline int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp) { }
+#endif /* CONFIG_RESCTRL_FS_PSEUDO_LOCK */
+
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a388ef66ef4c..e59271515a46 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1453,7 +1453,8 @@ static ssize_t rdtgroup_mode_write(struct kernfs_open_file *of,
goto out;
}
rdtgrp->mode = RDT_MODE_EXCLUSIVE;
- } else if (!strcmp(buf, "pseudo-locksetup")) {
+ } else if (IS_ENABLED(CONFIG_RESCTRL_FS_PSEUDO_LOCK) &&
+ !strcmp(buf, "pseudo-locksetup")) {
ret = rdtgroup_locksetup_enter(rdtgrp);
if (ret)
goto out;
--
2.39.2
* Re: [PATCH v6 24/42] x86/resctrl: Allow an architecture to disable pseudo lock
2025-02-07 18:18 ` [PATCH v6 24/42] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
@ 2025-02-20 0:56 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:56 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> Pseudo-lock relies on knowledge of the micro-architecture to disable
> prefetchers etc.
>
> On arm64 these controls are typically secure only, meaning linux can't
> access them. Arm's cache-lockdown feature works in a very different
> way. Resctrl's pseudo-lock isn't going to be used on arm64 platforms.
>
> Add a Kconfig symbol that can be selected by the architecture. This
> enables or disables building of the pseudo_lock.c file, and replaces
> the functions with stubs. An additional IS_ENABLED() check is needed
> in rdtgroup_mode_write() so that attempting to enable pseudo-lock
> reports an "Unknown or unsupported mode" to user-space via the
> last_cmd_status file.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* [PATCH v6 25/42] x86/resctrl: Make prefetch_disable_bits belong to the arch code
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (23 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 24/42] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 0:59 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 26/42] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
` (18 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
value provided by the architecture, but is largely read by other
architecture helpers.
Make resctrl_arch_get_prefetch_disable_bits() set prefetch_disable_bits
so that the variable can be kept private to the arch code, where the
other arch-code helpers can use its cached value.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Tweaked the word 'export'.
* Swapped the second paragraph for Reinette's version.
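Note for readers (illustration only, not part of the patch): the
"getter also fills the cache" shape can be sketched in userspace as
below. The enum, the sketch_* names and the second helper are
hypothetical; only the values 0xF and 0x5 and the reset-to-zero on entry
are taken from the diff.

```c
#include <stdint.h>

/* Hypothetical CPU model tags standing in for the boot_cpu_data checks. */
enum sketch_model {
	SKETCH_MODEL_FOUR_BITS,	/* platform whose disable mask is 0xF */
	SKETCH_MODEL_GOLDMONT,	/* platform whose disable mask is 0x5 */
	SKETCH_MODEL_UNKNOWN,
};

/* Private to the "arch" side; the caller never touches it directly. */
static uint64_t sketch_prefetch_disable_bits;

uint64_t sketch_get_prefetch_disable_bits(enum sketch_model model)
{
	/* Reset first so an unsupported platform leaves a clean cache. */
	sketch_prefetch_disable_bits = 0;

	switch (model) {
	case SKETCH_MODEL_FOUR_BITS:
		sketch_prefetch_disable_bits = 0xF;
		break;
	case SKETCH_MODEL_GOLDMONT:
		sketch_prefetch_disable_bits = 0x5;
		break;
	default:
		break;
	}

	return sketch_prefetch_disable_bits;
}

/* Another "arch" helper reading the cached value set by the getter. */
uint64_t sketch_cached_prefetch_disable_bits(void)
{
	return sketch_prefetch_disable_bits;
}
```

The caller only needs the return value to decide whether pseudo-locking
is supported, while later helpers on the arch side rely on the cached
copy.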
---
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index d078b89380dd..13145e744556 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -84,6 +84,8 @@ static const struct class pseudo_lock_class = {
*/
u64 resctrl_arch_get_prefetch_disable_bits(void)
{
+ prefetch_disable_bits = 0;
+
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
boot_cpu_data.x86 != 6)
return 0;
@@ -99,7 +101,8 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
* 3 DCU IP Prefetcher Disable (R/W)
* 63:4 Reserved
*/
- return 0xF;
+ prefetch_disable_bits = 0xF;
+ break;
case INTEL_ATOM_GOLDMONT:
case INTEL_ATOM_GOLDMONT_PLUS:
/*
@@ -110,10 +113,11 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
* 2 DCU Hardware Prefetcher Disable (R/W)
* 63:3 Reserved
*/
- return 0x5;
+ prefetch_disable_bits = 0x5;
+ break;
}
- return 0;
+ return prefetch_disable_bits;
}
/**
@@ -713,8 +717,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* Not knowing the bits to disable prefetching implies that this
* platform does not support Cache Pseudo-Locking.
*/
- prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
- if (prefetch_disable_bits == 0) {
+ if (resctrl_arch_get_prefetch_disable_bits() == 0) {
rdt_last_cmd_puts("Pseudo-locking not supported\n");
return -EINVAL;
}
--
2.39.2
* Re: [PATCH v6 25/42] x86/resctrl: Make prefetch_disable_bits belong to the arch code
2025-02-07 18:18 ` [PATCH v6 25/42] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
@ 2025-02-20 0:59 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 0:59 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
> value provided by the architecture, but is largely read by other
> architecture helpers.
>
> Make resctrl_arch_get_prefetch_disable_bits() set prefetch_disable_bits
> so that it can be isolated to arch-code from where the other arch-code
> helpers can use its cached-value.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* [PATCH v6 26/42] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (24 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 25/42] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 1:03 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl James Morse
` (17 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
resctrl_arch_pseudo_lock_fn() has architecture-specific behaviour,
and takes a struct rdtgroup as an argument.
After the filesystem code moves to /fs/, the definition of struct
rdtgroup will not be available to the architecture code.
The only reason resctrl_arch_pseudo_lock_fn() wants the rdtgroup is
for the CLOSID. Embed that in the pseudo_lock_region as a closid,
and move the definition of struct pseudo_lock_region to resctrl.h.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Change since v1:
* [Commit message only] Typo fix:
s/hw_closid/closid/g
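Note for readers (illustration only, not part of the patch): a reduced
userspace model of the interface change. The sketch_* types and
functions are hypothetical. In particular, the sketch assumes the caller
copies the group's CLOSID into the region before starting the thread
function; the hunks quoted here do not show where that assignment is
made.

```c
#include <stdint.h>

/* Cut-down stand-ins for the two structures involved. */
struct sketch_plr {
	uint32_t closid;	/* the field this patch adds to the region */
	int thread_done;
};

struct sketch_rdtgroup {
	uint32_t closid;
	struct sketch_plr *plr;
};

/* Records what would have been written to the association register. */
static uint32_t sketch_last_closid_written;

/* "Arch" side: only ever sees the region, never the group. */
int sketch_arch_pseudo_lock_fn(void *_plr)
{
	struct sketch_plr *plr = _plr;

	sketch_last_closid_written = plr->closid;
	plr->thread_done = 1;
	return 0;
}

/* "Filesystem" side: owns the group and hands only the region over. */
int sketch_pseudo_lock_create(struct sketch_rdtgroup *rdtgrp)
{
	struct sketch_plr *plr = rdtgrp->plr;

	plr->closid = rdtgrp->closid;	/* assumed step, see note above */
	plr->thread_done = 0;
	return sketch_arch_pseudo_lock_fn(plr);
}

uint32_t sketch_get_last_closid_written(void)
{
	return sketch_last_closid_written;
}
```

Because the thread function takes the region alone, the arch side no
longer needs the definition of the group structure.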
---
arch/x86/include/asm/resctrl.h | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 37 ---------------------
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 12 +++----
include/linux/resctrl.h | 39 +++++++++++++++++++++++
4 files changed, 46 insertions(+), 44 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 86407dbde583..011bf67a1866 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -204,7 +204,7 @@ static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
void *ctx) { };
u64 resctrl_arch_get_prefetch_disable_bits(void);
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_pseudo_lock_fn(void *_plr);
int resctrl_arch_measure_cycles_lat_fn(void *_plr);
int resctrl_arch_measure_l2_residency(void *_plr);
int resctrl_arch_measure_l3_residency(void *_plr);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 8d35bb423aad..0d13006e920b 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -208,43 +208,6 @@ struct mongroup {
u32 rmid;
};
-/**
- * struct pseudo_lock_region - pseudo-lock region information
- * @s: Resctrl schema for the resource to which this
- * pseudo-locked region belongs
- * @d: RDT domain to which this pseudo-locked region
- * belongs
- * @cbm: bitmask of the pseudo-locked region
- * @lock_thread_wq: waitqueue used to wait on the pseudo-locking thread
- * completion
- * @thread_done: variable used by waitqueue to test if pseudo-locking
- * thread completed
- * @cpu: core associated with the cache on which the setup code
- * will be run
- * @line_size: size of the cache lines
- * @size: size of pseudo-locked region in bytes
- * @kmem: the kernel memory associated with pseudo-locked region
- * @minor: minor number of character device associated with this
- * region
- * @debugfs_dir: pointer to this region's directory in the debugfs
- * filesystem
- * @pm_reqs: Power management QoS requests related to this region
- */
-struct pseudo_lock_region {
- struct resctrl_schema *s;
- struct rdt_ctrl_domain *d;
- u32 cbm;
- wait_queue_head_t lock_thread_wq;
- int thread_done;
- int cpu;
- unsigned int line_size;
- unsigned int size;
- void *kmem;
- unsigned int minor;
- struct dentry *debugfs_dir;
- struct list_head pm_reqs;
-};
-
/**
* struct rdtgroup - store rdtgroup's data in resctrl file system.
* @kn: kernfs node
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 13145e744556..e7f713eb4490 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -414,7 +414,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
/**
* resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
- * @_rdtgrp: resource group to which pseudo-lock region belongs
+ * @_plr: the pseudo-lock region descriptor
*
* This is the core pseudo-locking flow.
*
@@ -431,10 +431,9 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_plr)
{
- struct rdtgroup *rdtgrp = _rdtgrp;
- struct pseudo_lock_region *plr = rdtgrp->plr;
+ struct pseudo_lock_region *plr = _plr;
u32 rmid_p, closid_p;
unsigned long i;
u64 saved_msr;
@@ -494,7 +493,8 @@ int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
* pseudo-locked followed by reading of kernel memory to load it
* into the cache.
*/
- __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, rdtgrp->closid);
+ __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, plr->closid);
+
/*
* Cache was flushed earlier. Now access kernel memory to read it
* into cache region associated with just activated plr->closid.
@@ -1312,7 +1312,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
plr->thread_done = 0;
- thread = kthread_run_on_cpu(resctrl_arch_pseudo_lock_fn, rdtgrp,
+ thread = kthread_run_on_cpu(resctrl_arch_pseudo_lock_fn, plr,
plr->cpu, "pseudo_lock/%u");
if (IS_ERR(thread)) {
ret = PTR_ERR(thread);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4d02e34c4401..524f35b5532b 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -58,6 +58,45 @@ enum resctrl_conf_type {
#define CDP_NUM_TYPES (CDP_DATA + 1)
+/*
+ * struct pseudo_lock_region - pseudo-lock region information
+ * @s: Resctrl schema for the resource to which this
+ * pseudo-locked region belongs
+ * @closid: The closid that this pseudo-locked region uses
+ * @d: RDT domain to which this pseudo-locked region
+ * belongs
+ * @cbm: bitmask of the pseudo-locked region
+ * @lock_thread_wq: waitqueue used to wait on the pseudo-locking thread
+ * completion
+ * @thread_done: variable used by waitqueue to test if pseudo-locking
+ * thread completed
+ * @cpu: core associated with the cache on which the setup code
+ * will be run
+ * @line_size: size of the cache lines
+ * @size: size of pseudo-locked region in bytes
+ * @kmem: the kernel memory associated with pseudo-locked region
+ * @minor: minor number of character device associated with this
+ * region
+ * @debugfs_dir: pointer to this region's directory in the debugfs
+ * filesystem
+ * @pm_reqs: Power management QoS requests related to this region
+ */
+struct pseudo_lock_region {
+ struct resctrl_schema *s;
+ u32 closid;
+ struct rdt_ctrl_domain *d;
+ u32 cbm;
+ wait_queue_head_t lock_thread_wq;
+ int thread_done;
+ int cpu;
+ unsigned int line_size;
+ unsigned int size;
+ void *kmem;
+ unsigned int minor;
+ struct dentry *debugfs_dir;
+ struct list_head pm_reqs;
+};
+
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @new_ctrl: new ctrl value to be loaded
--
2.39.2
* Re: [PATCH v6 26/42] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
2025-02-07 18:18 ` [PATCH v6 26/42] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
@ 2025-02-20 1:03 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 1:03 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> resctrl_arch_pseudo_lock_fn() has architecture specific behaviour,
> and takes a struct rdtgroup as an argument.
>
> After the filesystem code moves to /fs/, the definition of struct
> rdtgroup will not be available to the architecture code.
>
> The only reason resctrl_arch_pseudo_lock_fn() wants the rdtgroup is
> for the CLOSID. Embed that in the pseudo_lock_region as a closid,
> and move the definition of struct pseudo_lock_region to resctrl.h.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (25 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 26/42] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 1:17 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources James Morse
` (16 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
resctrl_file_fflags_init() is called from the architecture specific code
to make the 'thread_throttle_mode' file visible. The architecture specific
code has already set the membw.throttle_mode in the rdt_resource.
This forces the RFTYPE flags used by resctrl to be exposed to the
architecture specific code.
This doesn't need to be specific to the architecture, the throttle_mode
can be used by resctrl to determine if the 'thread_throttle_mode' file
should be visible. This allows the RFTYPE flags to be private to resctrl.
Add thread_throttle_mode_init(), and use it to call
resctrl_file_fflags_init() from resctrl_setup(). This avoids
publishing an extra function between the architecture and filesystem
code.
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v5:
* Added checking for SMBA.
* Added printing of undefined to rdt_thread_throttle_mode_show().
* Major juggling around commit 2937f9c361f7 ("x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflags")
* Dropped tags, split patch
---
arch/x86/kernel/cpu/resctrl/core.c | 3 ---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++++++++++++
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 1fb4eb4e0ea9..e4b676879227 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -229,9 +229,6 @@ static __init bool __get_mem_config_intel(struct rdt_resource *r)
else
r->membw.throttle_mode = THREAD_THROTTLE_MAX;
- resctrl_file_fflags_init("thread_throttle_mode",
- RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
-
r->alloc_capable = true;
return true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e59271515a46..58feba3feefd 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2064,6 +2064,16 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
return NULL;
}
+static void thread_throttle_mode_init(void)
+{
+ struct rdt_resource *r_mba;
+
+ r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
+ if (r_mba->membw.throttle_mode != THREAD_THROTTLE_UNDEFINED)
+ resctrl_file_fflags_init("thread_throttle_mode",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
+}
+
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
{
struct rftype *rft;
@@ -4277,6 +4287,8 @@ int __init resctrl_init(void)
rdtgroup_setup_default();
+ thread_throttle_mode_init();
+
ret = resctrl_mon_resource_init();
if (ret)
return ret;
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl
2025-02-07 18:18 ` [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl James Morse
@ 2025-02-20 1:17 ` Reinette Chatre
2025-02-28 19:56 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 1:17 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> resctrl_file_fflags_init() is called from the architecture specific code
> to make the 'thread_throttle_mode' file visible. The architecture specific
> code has already set the membw.throttle_mode in the rdt_resource.
>
> This forces the RFTYPE flags used by resctrl to be exposed to the
> architecture specific code.
>
> This doesn't need to be specific to the architecture, the throttle_mode
> can be used by resctrl to determine if the 'thread_throttle_mode' file
> should be visible. This allows the RFTYPE flags to be private to resctrl.
>
> Add thread_throttle_mode_init(), and use it to call
> resctrl_file_fflags_init() from resctrl_setup(). This avoids
" from resctrl_setup()" -> " from resctrl_init()" ?
> publishing an extra function between the architecture and filesystem
> code.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
| Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl
2025-02-20 1:17 ` Reinette Chatre
@ 2025-02-28 19:56 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:56 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi Reinette,
On 20/02/2025 01:17, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> resctrl_file_fflags_init() is called from the architecture specific code
>> to make the 'thread_throttle_mode' file visible. The architecture specific
>> code has already set the membw.throttle_mode in the rdt_resource.
>>
>> This forces the RFTYPE flags used by resctrl to be exposed to the
>> architecture specific code.
>>
>> This doesn't need to be specific to the architecture, the throttle_mode
>> can be used by resctrl to determine if the 'thread_throttle_mode' file
>> should be visible. This allows the RFTYPE flags to be private to resctrl.
>>
>> Add thread_throttle_mode_init(), and use it to call
>> resctrl_file_fflags_init() from resctrl_setup(). This avoids
>
> " from resctrl_setup()" -> " from resctrl_init()" ?
Ugh, thanks!
>> publishing an extra function between the architecture and filesystem
>> code.
> | Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Thanks!
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (26 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 27/42] x86/resctrl: Move RFTYPE flags to be managed by resctrl James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 1:20 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header James Morse
` (15 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
Now that the visibility of throttle_mode is being managed by resctrl, it
should consider resources other than MBA that may have a throttle_mode.
SMBA is one such resource.
Extend resctrl_file_fflags_init() to check SMBA for a throttle_mode.
Adding support for multiple resources means it is possible for a platform
with both MBA and SMBA, but an undefined throttle_mode on one of them
to make the file visible.
Add the 'undefined' case to rdt_thread_throttle_mode_show().
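
The visibility rule described above can be sketched in userspace C. This is
only a rough model under stated assumptions: struct res_model and both
function names are hypothetical, the enum ordering is assumed, and assert()
stands in for WARN_ON_ONCE(). It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Enumerator names follow the patch; the ordering here is an assumption. */
enum membw_throttle_mode {
	THREAD_THROTTLE_UNDEFINED = 0,
	THREAD_THROTTLE_MAX,
	THREAD_THROTTLE_PER_THREAD,
};

/* Hypothetical reduced resource: only the two fields the check reads. */
struct res_model {
	bool alloc_capable;
	enum membw_throttle_mode throttle_mode;
};

/*
 * The file becomes visible if either resource is alloc_capable and has a
 * defined throttle_mode. One defined mode is enough, so the other resource
 * may still report an undefined mode when its file is read.
 */
static bool throttle_file_visible(const struct res_model *mba,
				  const struct res_model *smba)
{
	enum membw_throttle_mode mode = THREAD_THROTTLE_UNDEFINED;

	if (mba->alloc_capable &&
	    mba->throttle_mode != THREAD_THROTTLE_UNDEFINED)
		mode = mba->throttle_mode;

	if (smba->alloc_capable &&
	    smba->throttle_mode != THREAD_THROTTLE_UNDEFINED)
		mode = smba->throttle_mode;

	return mode != THREAD_THROTTLE_UNDEFINED;
}

/* String the show() handler would print for each mode, without the newline. */
static const char *throttle_mode_str(enum membw_throttle_mode mode)
{
	switch (mode) {
	case THREAD_THROTTLE_PER_THREAD:
		return "per-thread";
	case THREAD_THROTTLE_MAX:
		return "max";
	case THREAD_THROTTLE_UNDEFINED:
		return "undefined";
	}

	/* Userspace stand-in for WARN_ON_ONCE(1): no other value is expected. */
	assert(!"unknown throttle mode");
	return "unknown";
}
```

With MBA reporting 'max' and SMBA left undefined, the model makes the file
visible, which is why the show() path needs a string for the undefined case.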
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v5:
* This change split out of the previous patch.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 33 +++++++++++++++++++++-----
1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 58feba3feefd..5fc60c9ce28f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1188,10 +1188,19 @@ static int rdt_thread_throttle_mode_show(struct kernfs_open_file *of,
struct resctrl_schema *s = of->kn->parent->priv;
struct rdt_resource *r = s->res;
- if (r->membw.throttle_mode == THREAD_THROTTLE_PER_THREAD)
+ switch (r->membw.throttle_mode) {
+ case THREAD_THROTTLE_PER_THREAD:
seq_puts(seq, "per-thread\n");
- else
+ return 0;
+ case THREAD_THROTTLE_MAX:
seq_puts(seq, "max\n");
+ return 0;
+ case THREAD_THROTTLE_UNDEFINED:
+ seq_puts(seq, "undefined\n");
+ return 0;
+ }
+
+ WARN_ON_ONCE(1);
return 0;
}
@@ -2066,12 +2075,24 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
static void thread_throttle_mode_init(void)
{
- struct rdt_resource *r_mba;
+ enum membw_throttle_mode throttle_mode = THREAD_THROTTLE_UNDEFINED;
+ struct rdt_resource *r_mba, *r_smba;
r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
- if (r_mba->membw.throttle_mode != THREAD_THROTTLE_UNDEFINED)
- resctrl_file_fflags_init("thread_throttle_mode",
- RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
+ if (r_mba->alloc_capable &&
+ r_mba->membw.throttle_mode != THREAD_THROTTLE_UNDEFINED)
+ throttle_mode = r_mba->membw.throttle_mode;
+
+ r_smba = resctrl_arch_get_resource(RDT_RESOURCE_SMBA);
+ if (r_smba->alloc_capable &&
+ r_smba->membw.throttle_mode != THREAD_THROTTLE_UNDEFINED)
+ throttle_mode = r_smba->membw.throttle_mode;
+
+ if (throttle_mode == THREAD_THROTTLE_UNDEFINED)
+ return;
+
+ resctrl_file_fflags_init("thread_throttle_mode",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
}
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources
2025-02-07 18:18 ` [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources James Morse
@ 2025-02-20 1:20 ` Reinette Chatre
2025-02-28 19:55 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 1:20 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> Now that the visibility of throttle_mode is being managed by resctrl, it
> should consider resources other than MBA that may have a throttle_mode.
> SMBA is one such resource.
>
> Extend resctrl_file_fflags_init() to check SMBA for a throttle_mode.
>
"Extend resctrl_file_fflags_init()" -> "Extend thread_throttle_mode_init()"?
> Adding support for multiple resources means it is possible for a platform
> with both MBA and SMBA, but an undefined throttle_mode on one of them
> to make the file visible.
>
> Add the 'undefined' case to rdt_thread_throttle_mode_show().
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
| Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources
2025-02-20 1:20 ` Reinette Chatre
@ 2025-02-28 19:55 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:55 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi Reinette,
On 20/02/2025 01:20, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> Now that the visibility of throttle_mode is being managed by resctrl, it
>> should consider resources other than MBA that may have a throttle_mode.
>> SMBA is one such resource.
>>
>> Extend resctrl_file_fflags_init() to check SMBA for a throttle_mode.
>>
>
> "Extend resctrl_file_fflags_init()" -> "Extend thread_throttle_mode_init()"?
Gah, more rebase noise.
>> Adding support for multiple resources means it is possible for a platform
>> with both MBA and SMBA, but an undefined throttle_mode on one of them
>> to make the file visible.
>>
>> Add the 'undefined' case to rdt_thread_throttle_mode_show().
> | Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Thanks!
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (27 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 28/42] x86/resctrl: Handle throttle_mode for SMBA resources James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 1:27 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
` (14 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
get_config_index() is used by the architecture specific code to map a
CLOSID+type pair to an index in the configuration arrays.
MPAM needs to do this too to preserve the ABI to user-space, there is
no reason to do it differently.
Move the helper to a header file.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v1:
* Reindent resctrl_get_config_index() as per coding-style.rst rules.
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++----------------
include/linux/resctrl.h | 15 +++++++++++++++
2 files changed, 18 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index a93b40ea0bad..032a585293af 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -287,25 +287,12 @@ static int parse_line(char *line, struct resctrl_schema *s,
return -EINVAL;
}
-static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
-{
- switch (type) {
- default:
- case CDP_NONE:
- return closid;
- case CDP_CODE:
- return closid * 2 + 1;
- case CDP_DATA:
- return closid * 2;
- }
-}
-
int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type t, u32 cfg_val)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(d);
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
- u32 idx = get_config_index(closid, t);
+ u32 idx = resctrl_get_config_index(closid, t);
struct msr_param msr_param;
if (!cpumask_test_cpu(smp_processor_id(), &d->hdr.cpu_mask))
@@ -342,7 +329,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!cfg->have_new_ctrl)
continue;
- idx = get_config_index(closid, t);
+ idx = resctrl_get_config_index(closid, t);
if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
continue;
hw_dom->ctrl_val[idx] = cfg->new_ctrl;
@@ -462,7 +449,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type type)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(d);
- u32 idx = get_config_index(closid, type);
+ u32 idx = resctrl_get_config_index(closid, type);
return hw_dom->ctrl_val[idx];
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 524f35b5532b..29415d023aab 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
*/
void resctrl_arch_mon_event_config_read(void *config_info);
+/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
+static inline u32 resctrl_get_config_index(u32 closid,
+ enum resctrl_conf_type type)
+{
+ switch (type) {
+ default:
+ case CDP_NONE:
+ return closid;
+ case CDP_CODE:
+ return closid * 2 + 1;
+ case CDP_DATA:
+ return closid * 2;
+ }
+}
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-02-07 18:18 ` [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header James Morse
@ 2025-02-20 1:27 ` Reinette Chatre
2025-02-28 19:51 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 1:27 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> get_config_index() is used by the architecture specific code to map a
> CLOSID+type pair to an index in the configuration arrays.
>
> MPAM needs to do this too to preserve the ABI to user-space, there is
> no reason to do it differently.
>
> Move the helper to a header file.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v1:
> * Reindent resctrl_get_config_index() as per coding-style.rst rules.
> ---
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++----------------
> include/linux/resctrl.h | 15 +++++++++++++++
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index a93b40ea0bad..032a585293af 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -287,25 +287,12 @@ static int parse_line(char *line, struct resctrl_schema *s,
> return -EINVAL;
> }
>
> -static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
> -{
> - switch (type) {
> - default:
> - case CDP_NONE:
> - return closid;
> - case CDP_CODE:
> - return closid * 2 + 1;
> - case CDP_DATA:
> - return closid * 2;
> - }
> -}
> -
...
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
> */
> void resctrl_arch_mon_event_config_read(void *config_info);
>
> +/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
> +static inline u32 resctrl_get_config_index(u32 closid,
> + enum resctrl_conf_type type)
> +{
> + switch (type) {
> + default:
> + case CDP_NONE:
> + return closid;
> + case CDP_CODE:
> + return closid * 2 + 1;
> + case CDP_DATA:
> + return closid * 2;
> + }
> +}
> +
Could you please add the motivation for the use of an inline function?
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-02-20 1:27 ` Reinette Chatre
@ 2025-02-28 19:51 ` James Morse
2025-03-01 2:28 ` Reinette Chatre
0 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-28 19:51 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 01:27, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> get_config_index() is used by the architecture specific code to map a
>> CLOSID+type pair to an index in the configuration arrays.
>>
>> MPAM needs to do this too to preserve the ABI to user-space, there is
>> no reason to do it differently.
>>
>> Move the helper to a header file.
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
>> */
>> void resctrl_arch_mon_event_config_read(void *config_info);
>>
>> +/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>> +static inline u32 resctrl_get_config_index(u32 closid,
>> + enum resctrl_conf_type type)
>> +{
>> + switch (type) {
>> + default:
>> + case CDP_NONE:
>> + return closid;
>> + case CDP_CODE:
>> + return closid * 2 + 1;
>> + case CDP_DATA:
>> + return closid * 2;
>> + }
>> +}
>> +
> Could you please add the motivation for the use of an inline function?
Putting this in the header file means it isn't duplicated, so its behaviour can't become
different. If it's in a header file, it has to be marked inline, otherwise every C file that
includes it gets a copy that probably isn't used, and upsets the linker.
Calling from the arch code into the filesystem prevents the arch code from being
standalone. This is a useful direction of travel because it allows fs/resctrl to one
day become a module.
Today, the compiler is choosing to inline this:
| x86_64-linux-objdump -d ctrlmondata.o | grep resctrl_get_config_index | wc -l
| 0
This kind of arithmetic for an array lookup is the kind of thing it's good to give the
compiler full visibility of, as it's good fodder for constant folding.
For so few call sites, I don't think this is really worth thinking about.
Forcing this call out of line makes the kernel text bigger, but only by 32 bytes.
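
As a rough userspace illustration of the mapping under discussion (a sketch
only: the u32 typedef, the enum ordering, and the names config_index() and
config_index_selfcheck() are stand-ins chosen for this example, not kernel
API), code and data configurations for a CLOSID land on adjacent indices:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's u32. */
typedef uint32_t u32;

/* Enumerator names follow resctrl.h; the ordering is assumed here. */
enum resctrl_conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/* Same arithmetic as the helper in the patch, under a different name. */
static inline u32 config_index(u32 closid, enum resctrl_conf_type type)
{
	switch (type) {
	default:
	case CDP_NONE:
		return closid;		/* CDP disabled: one slot per CLOSID */
	case CDP_CODE:
		return closid * 2 + 1;	/* odd slot of the pair */
	case CDP_DATA:
		return closid * 2;	/* even slot of the pair */
	}
}

/*
 * Hypothetical self-check: with CDP enabled, num_closid CLOSIDs occupy
 * indices 0..2*num_closid-1, data on even and code on odd indices.
 */
static void config_index_selfcheck(u32 num_closid)
{
	for (u32 c = 0; c < num_closid; c++) {
		assert(config_index(c, CDP_NONE) == c);
		assert(config_index(c, CDP_DATA) == 2 * c);
		assert(config_index(c, CDP_CODE) == 2 * c + 1);
	}
}
```

When the type argument is a compile-time constant, as it can be at a call
site that passes a fixed enumerator, an inlined copy of the switch may reduce
to a single shift or add.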
I've expanded the last paragraph of the commit message to read:
| Move the helper to a header file to allow all architectures that either
| use or emulate CDP to use the same pattern of CLOSID values. Moving
| this to a header file means it must be marked inline, which matches
| the existing compiler choice for this static function.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-02-28 19:51 ` James Morse
@ 2025-03-01 2:28 ` Reinette Chatre
2025-03-06 19:28 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-03-01 2:28 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/28/25 11:51 AM, James Morse wrote:
> Hi Reinette,
>
> On 20/02/2025 01:27, Reinette Chatre wrote:
>> On 2/7/25 10:18 AM, James Morse wrote:
>>> get_config_index() is used by the architecture specific code to map a
>>> CLOSID+type pair to an index in the configuration arrays.
>>>
>>> MPAM needs to do this too to preserve the ABI to user-space, there is
>>> no reason to do it differently.
>>>
>>> Move the helper to a header file.
>
>>> --- a/include/linux/resctrl.h
>>> +++ b/include/linux/resctrl.h
>>> @@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
>>> */
>>> void resctrl_arch_mon_event_config_read(void *config_info);
>>>
>>> +/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>>> +static inline u32 resctrl_get_config_index(u32 closid,
>>> + enum resctrl_conf_type type)
>>> +{
>>> + switch (type) {
>>> + default:
>>> + case CDP_NONE:
>>> + return closid;
>>> + case CDP_CODE:
>>> + return closid * 2 + 1;
>>> + case CDP_DATA:
>>> + return closid * 2;
>>> + }
>>> +}
>>> +
>
>> Could you please add the motivation for the use of an inline function?
>
> Putting this in the header file means it isn't duplicated, so its behaviour can't become
I am not following this. How would making this part of a .c file of fs/resctrl with just
the prototype in include/linux/resctrl.h result in this function being duplicated?
> different. If its in a header file, it has to be marked inline otherwise every C file that
> includes it gets a copy that probably isn't used, and upsets the linker.
>
> Calling from the arch code into the filesystem prevents the arch code from being
> standalone. This is a useful direction of travel because it allows fs/resctrl to one
> day become a module
Don't we have this already with all the needed CPU and domain management (
resctrl_online_ctrl_domain(), resctrl_online_mon_domain(), resctrl_online_cpu(),
resctrl_offline_cpu(), etc.)?
>
> Today, the compiler is choosing to inline this:
> | x86_64-linux-objdump -d ctrlmondata.o | grep resctrl_get_config_index | wc -l
> | 0
>
> This kind of arithmetic for an array lookup is the kind of thing its good to give the
> compiler full visibility of as its good fodder for constant folding.
>
> For so few call sites, I don't think this is really worth thinking about.
> Forcing this call out of line makes the kernel text bigger, but only by 32 bytes.
>
>
> I've expanded the last paragraph of the commit message to read:
> | Move the helper to a header file to allow all architectures that either
> | use or emulate CDP to use the same pattern of CLOSID values. Moving
> | this to a header file means it must be marked inline, which matches
> | the existing compiler choice for this static function.
>
>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-03-01 2:28 ` Reinette Chatre
@ 2025-03-06 19:28 ` James Morse
2025-03-06 22:52 ` Reinette Chatre
0 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-03-06 19:28 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 01/03/2025 02:28, Reinette Chatre wrote:
> On 2/28/25 11:51 AM, James Morse wrote:
>> On 20/02/2025 01:27, Reinette Chatre wrote:
>>> On 2/7/25 10:18 AM, James Morse wrote:
>>>> get_config_index() is used by the architecture specific code to map a
>>>> CLOSID+type pair to an index in the configuration arrays.
>>>>
>>>> MPAM needs to do this too to preserve the ABI to user-space, there is
>>>> no reason to do it differently.
>>>>
>>>> Move the helper to a header file.
>>
>>>> --- a/include/linux/resctrl.h
>>>> +++ b/include/linux/resctrl.h
>>>> @@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
>>>> */
>>>> void resctrl_arch_mon_event_config_read(void *config_info);
>>>>
>>>> +/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>>>> +static inline u32 resctrl_get_config_index(u32 closid,
>>>> + enum resctrl_conf_type type)
>>>> +{
>>>> + switch (type) {
>>>> + default:
>>>> + case CDP_NONE:
>>>> + return closid;
>>>> + case CDP_CODE:
>>>> + return closid * 2 + 1;
>>>> + case CDP_DATA:
>>>> + return closid * 2;
>>>> + }
>>>> +}
>>>> +
>>
>>> Could you please add the motivation for the use of an inline function?
>>
>> Putting this in the header file means it isn't duplicated, so its behaviour can't become
>
> I am not following this. How would making this part of a .c file of fs/resctrl with just
> the prototype in include/linux/resctrl.h result in this function being duplicated?
Ah, I misread this as one of the functions marked resctrl_arch_.
>> different. If its in a header file, it has to be marked inline otherwise every C file that
>> includes it gets a copy that probably isn't used, and upsets the linker.
>>
>> Calling from the arch code into the filesystem prevents the arch code from being
>> standalone. This is a useful direction of travel because it allows fs/resctrl to one
>> day become a module
> Don't we have this already with all the needed CPU and domain management (
> resctrl_online_ctrl_domain(), resctrl_online_mon_domain(), resctrl_online_cpu(),
> resctrl_offline_cpu(), etc.)?
And the realloc threshold, yes. These are the things that would need further abstraction
to allow the filesystem to be a module that isn't loaded. But these would all be changes
to the existing behaviour.
This one is just putting the definition in a header.
>> Today, the compiler is choosing to inline this:
>> | x86_64-linux-objdump -d ctrlmondata.o | grep resctrl_get_config_index | wc -l
>> | 0
>>
>> This kind of arithmetic for an array lookup is the kind of thing its good to give the
>> compiler full visibility of as its good fodder for constant folding.
>>
>> For so few call sites, I don't think this is really worth thinking about.
>> Forcing this call out of line makes the kernel text bigger, but only by 32 bytes.
>>
>>
>> I've expanded the last paragraph of the commit message to read:
>> | Move the helper to a header file to allow all architectures that either
>> | use or emulate CDP to use the same pattern of CLOSID values. Moving
>> | this to a header file means it must be marked inline, which matches
>> | the existing compiler choice for this static function.
Thanks,
James
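
For illustration only (this is not part of the patch), the CLOSID-to-index mapping discussed above can be exercised in user space. This is a minimal sketch: enum conf_type and config_index() are stand-in names for the kernel's enum resctrl_conf_type and resctrl_get_config_index(), and check_cdp_layout() is a hypothetical self-check added here. It shows that with CDP each CLOSID takes two adjacent slots in the configuration array, data at 2n and code at 2n + 1.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's enum resctrl_conf_type. */
enum conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/* Same switch as resctrl_get_config_index() in the quoted patch. */
static inline uint32_t config_index(uint32_t closid, enum conf_type type)
{
	switch (type) {
	default:
	case CDP_NONE:
		return closid;
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	}
}

/* Hypothetical self-check: data and code slots are adjacent for every CLOSID. */
static inline void check_cdp_layout(uint32_t num_closid)
{
	uint32_t closid;

	for (closid = 0; closid < num_closid; closid++) {
		uint32_t data = config_index(closid, CDP_DATA);
		uint32_t code = config_index(closid, CDP_CODE);

		assert(data == 2 * closid);
		assert(code == data + 1);
	}
}
```

Because the function is a pure switch on its arguments, a caller that passes a constant type gives the compiler everything it needs to fold the multiply and add into the array index.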
* Re: [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header
2025-03-06 19:28 ` James Morse
@ 2025-03-06 22:52 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-03-06 22:52 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 3/6/25 11:28 AM, James Morse wrote:
> Hi Reinette,
>
> On 01/03/2025 02:28, Reinette Chatre wrote:
>> On 2/28/25 11:51 AM, James Morse wrote:
>>> On 20/02/2025 01:27, Reinette Chatre wrote:
>>>> On 2/7/25 10:18 AM, James Morse wrote:
>>>>> get_config_index() is used by the architecture specific code to map a
>>>>> CLOSID+type pair to an index in the configuration arrays.
>>>>>
>>>>> MPAM needs to do this too to preserve the ABI to user-space, there is
>>>>> no reason to do it differently.
>>>>>
>>>>> Move the helper to a header file.
>>>
>>>>> --- a/include/linux/resctrl.h
>>>>> +++ b/include/linux/resctrl.h
>>>>> @@ -384,6 +384,21 @@ void resctrl_arch_mon_event_config_write(void *config_info);
>>>>> */
>>>>> void resctrl_arch_mon_event_config_read(void *config_info);
>>>>>
>>>>> +/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>>>>> +static inline u32 resctrl_get_config_index(u32 closid,
>>>>> + enum resctrl_conf_type type)
>>>>> +{
>>>>> + switch (type) {
>>>>> + default:
>>>>> + case CDP_NONE:
>>>>> + return closid;
>>>>> + case CDP_CODE:
>>>>> + return closid * 2 + 1;
>>>>> + case CDP_DATA:
>>>>> + return closid * 2;
>>>>> + }
>>>>> +}
>>>>> +
>>>
>>>> Could you please add the motivation for the use of an inline function?
>>>
>>> Putting this in the header file means it isn't duplicated, so its behaviour can't become
>>
>> I am not following this. How would making this part of a .c file of fs/resctrl with just
>> the prototype in include/linux/resctrl.h result in this function being duplicated?
>
> Ah, I misread this as one of the functions marked resctrl_arch_.
>
>
>>> different. If it's in a header file, it has to be marked inline otherwise every C file that
>>> includes it gets a copy that probably isn't used, and upsets the linker.
>>>
>>> Calling from the arch code into the filesystem prevents the arch code from being
>>> standalone. This is a useful direction of travel because it allows fs/resctrl to one
>>> day become a module
>
>> Don't we have this already with all the needed CPU and domain management (
>> resctrl_online_ctrl_domain(), resctrl_online_mon_domain(), resctrl_online_cpu(),
>> resctrl_offline_cpu(), etc.)?
>
> And the realloc threshold, yes. These are the things that would need further abstraction
> to allow the filesystem to be a module that isn't loaded. But these would all be changes
> to the existing behaviour.
> This one is just putting the definition in a header.
hmmm ... this seems a stretch as an argument since the filesystem is not currently a module
and cannot currently be a module. Adding a declaration to a header file that matches with
existing usage seems reasonable to me.
>
>
>>> Today, the compiler is choosing to inline this:
>>> | x86_64-linux-objdump -d ctrlmondata.o | grep resctrl_get_config_index | wc -l
>>> | 0
>>>
>>> This kind of arithmetic for an array lookup is the kind of thing it's good to give the
>>> compiler full visibility of as it's good fodder for constant folding.
>>>
>>> For so few call sites, I don't think this is really worth thinking about.
>>> Forcing this call out of line makes the kernel text bigger, but only by 32 bytes.
>>>
>>>
>>> I've expanded the last paragraph of the commit message to read:
>>> | Move the helper to a header file to allow all architectures that either
>>> | use or emulate CDP to use the same pattern of CLOSID values. Moving
>>> | this to a header file means it must be marked inline, which matches
>>> | the existing compiler choice for this static function.
This is fair and seems to be the only valid reason.
Reinette
* [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (28 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 29/42] x86/resctrl: Move get_config_index() to a header James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:08 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID James Morse
` (13 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
get_{mon,ctrl}_domain_from_cpu() are handy helpers that both the arch
code and resctrl need to use. Rename them to have a resctrl_ prefix
and move them to a header file.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes from v5:
* Added the word from to a comment.
---
arch/x86/kernel/cpu/resctrl/core.c | 30 ---------------------
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
include/linux/resctrl.h | 37 ++++++++++++++++++++++++++
5 files changed, 39 insertions(+), 34 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e4b676879227..921c351d57ae 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -348,36 +348,6 @@ static void cat_wrmsr(struct msr_param *m)
wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}
-struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
-{
- struct rdt_ctrl_domain *d;
-
- lockdep_assert_cpus_held();
-
- list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
- /* Find the domain that contains this CPU */
- if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
- return d;
- }
-
- return NULL;
-}
-
-struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
-{
- struct rdt_mon_domain *d;
-
- lockdep_assert_cpus_held();
-
- list_for_each_entry(d, &r->mon_domains, hdr.list) {
- /* Find the domain that contains this CPU */
- if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
- return d;
- }
-
- return NULL;
-}
-
u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
{
return resctrl_to_arch_res(r)->num_closid;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 0d13006e920b..c44c5b496355 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -475,8 +475,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
unsigned long cbm);
enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
int rdtgroup_tasks_assigned(struct rdtgroup *r);
-struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
-struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
int closids_supported(void);
void closid_free(int closid);
int alloc_rmid(u32 closid);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index d99a05fc1b44..470cf16f506e 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -773,7 +773,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
if (WARN_ON_ONCE(!pmbm_data))
return;
- dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
+ dom_mba = resctrl_get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
if (!dom_mba) {
pr_warn_once("Failure to get domain for MBA update\n");
return;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5fc60c9ce28f..08fec23a38bf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4274,7 +4274,7 @@ void resctrl_offline_cpu(unsigned int cpu)
if (!l3->mon_capable)
goto out_unlock;
- d = get_mon_domain_from_cpu(cpu, l3);
+ d = resctrl_get_mon_domain_from_cpu(cpu, l3);
if (d) {
if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 29415d023aab..511dab4ffcdc 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -3,6 +3,7 @@
#define _RESCTRL_H
#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pid.h>
@@ -399,6 +400,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
}
}
+/*
+ * Caller must hold the cpuhp read lock to prevent the struct rdt_domain from
+ * being freed.
+ */
+static inline struct rdt_ctrl_domain *
+resctrl_get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
+{
+ struct rdt_ctrl_domain *d;
+
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ /* Find the domain that contains this CPU */
+ if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
+ return d;
+ }
+
+ return NULL;
+}
+
+static inline struct rdt_mon_domain *
+resctrl_get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
+{
+ struct rdt_mon_domain *d;
+
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ /* Find the domain that contains this CPU */
+ if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
+ return d;
+ }
+
+ return NULL;
+}
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
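
For readers following along outside the kernel tree, the lookup these helpers perform can be sketched in user-space C. This is only an illustration under stated assumptions: struct domain, MAX_CPUS, domain_from_cpu() and domain_lookup_example() are hypothetical names, a 64-bit word stands in for the kernel's cpumask, a singly linked list stands in for list_head, and the lockdep_assert_cpus_held() check has no equivalent here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS	64U	/* limit of this sketch: one 64-bit mask per domain */

/* Hypothetical stand-in for a resctrl control or monitor domain. */
struct domain {
	int id;
	uint64_t cpu_mask;	/* bit n set means CPU n belongs to this domain */
	struct domain *next;	/* the kernel links domains with list_head instead */
};

/* Return the first domain whose mask contains @cpu, or NULL if none does. */
static struct domain *domain_from_cpu(struct domain *head, unsigned int cpu)
{
	struct domain *d;

	if (cpu >= MAX_CPUS)
		return NULL;

	for (d = head; d; d = d->next) {
		/* Find the domain that contains this CPU */
		if (d->cpu_mask & (UINT64_C(1) << cpu))
			return d;
	}

	return NULL;
}

/* Two domains of four CPUs each; CPU 9 belongs to neither. */
static void domain_lookup_example(void)
{
	struct domain second = { .id = 1, .cpu_mask = 0xf0, .next = NULL };
	struct domain first = { .id = 0, .cpu_mask = 0x0f, .next = &second };

	assert(domain_from_cpu(&first, 2) == &first);
	assert(domain_from_cpu(&first, 5) == &second);
	assert(domain_from_cpu(&first, 9) == NULL);
}
```

As in the patch, callers have to handle a NULL return, which is what the pr_warn_once() path in update_mba_bw() does.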
* Re: [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2025-02-07 18:18 ` [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
@ 2025-02-20 4:08 ` Reinette Chatre
2025-02-27 23:05 ` Fenghua Yu
2025-02-28 19:53 ` James Morse
0 siblings, 2 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:08 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> get_{mon,ctrl}_domain_from_cpu() are handy helpers that both the arch
> code and resctrl need to use. Rename them to have a resctrl_ prefix
> and move them to a header file.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes from v5:
> * Added the word from to a comment.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 30 ---------------------
> arch/x86/kernel/cpu/resctrl/internal.h | 2 --
> arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
> include/linux/resctrl.h | 37 ++++++++++++++++++++++++++
> 5 files changed, 39 insertions(+), 34 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index e4b676879227..921c351d57ae 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -348,36 +348,6 @@ static void cat_wrmsr(struct msr_param *m)
> wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
> }
>
> -struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
> -{
> - struct rdt_ctrl_domain *d;
> -
> - lockdep_assert_cpus_held();
> -
> - list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> - /* Find the domain that contains this CPU */
> - if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
> - return d;
> - }
> -
> - return NULL;
> -}
> -
> -struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
> -{
> - struct rdt_mon_domain *d;
> -
> - lockdep_assert_cpus_held();
> -
> - list_for_each_entry(d, &r->mon_domains, hdr.list) {
> - /* Find the domain that contains this CPU */
> - if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
> - return d;
> - }
> -
> - return NULL;
> -}
> -
> u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
> {
> return resctrl_to_arch_res(r)->num_closid;
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 0d13006e920b..c44c5b496355 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -475,8 +475,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
> unsigned long cbm);
> enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
> int rdtgroup_tasks_assigned(struct rdtgroup *r);
> -struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
> -struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
> int closids_supported(void);
> void closid_free(int closid);
> int alloc_rmid(u32 closid);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index d99a05fc1b44..470cf16f506e 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -773,7 +773,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> if (WARN_ON_ONCE(!pmbm_data))
> return;
>
> - dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
> + dom_mba = resctrl_get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
> if (!dom_mba) {
> pr_warn_once("Failure to get domain for MBA update\n");
> return;
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 5fc60c9ce28f..08fec23a38bf 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4274,7 +4274,7 @@ void resctrl_offline_cpu(unsigned int cpu)
> if (!l3->mon_capable)
> goto out_unlock;
>
> - d = get_mon_domain_from_cpu(cpu, l3);
> + d = resctrl_get_mon_domain_from_cpu(cpu, l3);
> if (d) {
> if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> cancel_delayed_work(&d->mbm_over);
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 29415d023aab..511dab4ffcdc 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -3,6 +3,7 @@
> #define _RESCTRL_H
>
> #include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <linux/pid.h>
> @@ -399,6 +400,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
> }
> }
>
> +/*
> + * Caller must hold the cpuhp read lock to prevent the struct rdt_domain from
struct rdt_domain has since been split into struct rdt_ctrl_domain and struct rdt_mon_domain.
I assume this comment covers both helpers so perhaps this can be "to prevent the domain
from ..."?
> + * being freed.
> + */
> +static inline struct rdt_ctrl_domain *
> +resctrl_get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
> +{
> + struct rdt_ctrl_domain *d;
> +
> + lockdep_assert_cpus_held();
> +
> + list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> + /* Find the domain that contains this CPU */
> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
> + return d;
> + }
> +
> + return NULL;
> +}
> +
> +static inline struct rdt_mon_domain *
> +resctrl_get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
> +{
> + struct rdt_mon_domain *d;
> +
> + lockdep_assert_cpus_held();
> +
> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
> + /* Find the domain that contains this CPU */
> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
> + return d;
> + }
> +
> + return NULL;
> +}
> +
Similar to previous requests, could you please provide a motivation for
the switch to inline?
> /*
> * Update the ctrl_val and apply this config right now.
> * Must be called on one of the domain's CPUs.
Reinette
* Re: [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2025-02-20 4:08 ` Reinette Chatre
@ 2025-02-27 23:05 ` Fenghua Yu
2025-02-28 19:53 ` James Morse
1 sibling, 0 replies; 135+ messages in thread
From: Fenghua Yu @ 2025-02-27 23:05 UTC (permalink / raw)
To: Reinette Chatre, James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi, James and Reinette,
On 2/19/25 20:08, Reinette Chatre wrote:
> Hi James,
>
> On 2/7/25 10:18 AM, James Morse wrote:
>> get_{mon,ctrl}_domain_from_cpu() are handy helpers that both the arch
>> code and resctrl need to use. Rename them to have a resctrl_ prefix
>> and move them to a header file.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
>> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>> ---
>> Changes from v5:
>> * Added the word from to a comment.
>> ---
>> arch/x86/kernel/cpu/resctrl/core.c | 30 ---------------------
>> arch/x86/kernel/cpu/resctrl/internal.h | 2 --
>> arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
>> include/linux/resctrl.h | 37 ++++++++++++++++++++++++++
>> 5 files changed, 39 insertions(+), 34 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index e4b676879227..921c351d57ae 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -348,36 +348,6 @@ static void cat_wrmsr(struct msr_param *m)
>> wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
>> }
>>
>> -struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
>> -{
>> - struct rdt_ctrl_domain *d;
>> -
>> - lockdep_assert_cpus_held();
>> -
>> - list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
>> - /* Find the domain that contains this CPU */
>> - if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> - return d;
>> - }
>> -
>> - return NULL;
>> -}
>> -
>> -struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
>> -{
>> - struct rdt_mon_domain *d;
>> -
>> - lockdep_assert_cpus_held();
>> -
>> - list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> - /* Find the domain that contains this CPU */
>> - if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> - return d;
>> - }
>> -
>> - return NULL;
>> -}
>> -
>> u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
>> {
>> return resctrl_to_arch_res(r)->num_closid;
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 0d13006e920b..c44c5b496355 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -475,8 +475,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
>> unsigned long cbm);
>> enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
>> int rdtgroup_tasks_assigned(struct rdtgroup *r);
>> -struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
>> -struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
>> int closids_supported(void);
>> void closid_free(int closid);
>> int alloc_rmid(u32 closid);
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index d99a05fc1b44..470cf16f506e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -773,7 +773,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>> if (WARN_ON_ONCE(!pmbm_data))
>> return;
>>
>> - dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
>> + dom_mba = resctrl_get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
>> if (!dom_mba) {
>> pr_warn_once("Failure to get domain for MBA update\n");
>> return;
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 5fc60c9ce28f..08fec23a38bf 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4274,7 +4274,7 @@ void resctrl_offline_cpu(unsigned int cpu)
>> if (!l3->mon_capable)
>> goto out_unlock;
>>
>> - d = get_mon_domain_from_cpu(cpu, l3);
>> + d = resctrl_get_mon_domain_from_cpu(cpu, l3);
>> if (d) {
>> if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
>> cancel_delayed_work(&d->mbm_over);
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 29415d023aab..511dab4ffcdc 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -3,6 +3,7 @@
>> #define _RESCTRL_H
>>
>> #include <linux/cacheinfo.h>
>> +#include <linux/cpu.h>
>> #include <linux/kernel.h>
>> #include <linux/list.h>
>> #include <linux/pid.h>
>> @@ -399,6 +400,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
>> }
>> }
>>
>> +/*
>> + * Caller must hold the cpuhp read lock to prevent the struct rdt_domain from
> struct rdt_domain has since been split into struct rdt_ctrl_domain and struct rdt_mon_domain.
> I assume this comment covers both helpers so perhaps this can be "to prevent the domain
> from ..."?
>
>> + * being freed.
>> + */
>> +static inline struct rdt_ctrl_domain *
>> +resctrl_get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
>> +{
>> + struct rdt_ctrl_domain *d;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
>> + /* Find the domain that contains this CPU */
>> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> + return d;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static inline struct rdt_mon_domain *
>> +resctrl_get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
>> +{
>> + struct rdt_mon_domain *d;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> + /* Find the domain that contains this CPU */
>> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> + return d;
>> + }
>> +
>> + return NULL;
>> +}
>> +
> Similar to previous requests, could you please provide a motivation for
> the switch to inline?
These two functions are moved from x86 core.c to resctrl.h (same as
resctrl_find_domain()).
If the motivation is to have one fewer file (fs/resctrl/core.c), would it be
better to create fs/resctrl/core.c, host the three functions in that
.c file, and remove "inline" from the .h file?
>> /*
>> * Update the ctrl_val and apply this config right now.
>> * Must be called on one of the domain's CPUs.
> Reinette
>
Thanks.
-Fenghua
* Re: [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2025-02-20 4:08 ` Reinette Chatre
2025-02-27 23:05 ` Fenghua Yu
@ 2025-02-28 19:53 ` James Morse
1 sibling, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 04:08, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> get_{mon,ctrl}_domain_from_cpu() are handy helpers that both the arch
>> code and resctrl need to use. Rename them to have a resctrl_ prefix
>> and move them to a header file.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 29415d023aab..511dab4ffcdc 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -3,6 +3,7 @@
>> #define _RESCTRL_H
>>
>> #include <linux/cacheinfo.h>
>> +#include <linux/cpu.h>
>> #include <linux/kernel.h>
>> #include <linux/list.h>
>> #include <linux/pid.h>
>> @@ -399,6 +400,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
>> }
>> }
>>
>> +/*
>> + * Caller must hold the cpuhp read lock to prevent the struct rdt_domain from
> struct rdt_domain has since been split into struct rdt_ctrl_domain and struct rdt_mon_domain.
> I assume this comment covers both helpers so perhaps this can be "to prevent the domain
> from ..."?
Makes sense - thanks!
>> + * being freed.
>> + */
>> +static inline struct rdt_ctrl_domain *
>> +resctrl_get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
>> +{
>> + struct rdt_ctrl_domain *d;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
>> + /* Find the domain that contains this CPU */
>> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> + return d;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static inline struct rdt_mon_domain *
>> +resctrl_get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
>> +{
>> + struct rdt_mon_domain *d;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> + /* Find the domain that contains this CPU */
>> + if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
>> + return d;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>
> Similar to previous requests, could you please provide a motivation for
> the switch to inline?
Hmmm, this has diverged over time. x86 is now using its get_domain_id_from_scope() and
resctrl_find_domain() to cover this. I can probably do away with MPAM's use of these...
~
This gets replaced by a patch that moves them to live next to their callers which lets
them be static, and makes the automated move feasible.
Thanks,
James
* [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (29 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 30/42] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:21 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 32/42] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
` (12 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
From: Amit Singh Tomar <amitsinght@marvell.com>
Resctrl allocates and finds free CLOSID values using the bits of a u32.
This restricts the number of control groups that can be created by
user-space.
MPAM has an architectural limit of 2^16 CLOSID values, and Intel x86 could
be extended beyond 32 values. There is at least one MPAM platform which
supports more than 32 CLOSID values.
Replace the fixed size bitmap with calls to the bitmap API to allocate
an array of a sufficient size.
ffs() returns '1' for bit 0, hence the existing code subtracts 1 from
the index to get the CLOSID value. find_first_bit() returns the bit
number which does not need adjusting.
Signed-off-by: Amit Singh Tomar <amitsinght@marvell.com>
[ morse: fixed the off-by-one in the allocator and the wrong
not-found value. Removed the limit. Rephrased the commit message. ]
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v5:
* This patch got pulled into this series.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 45 +++++++++++++++++---------
1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 08fec23a38bf..de79da30d500 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -130,8 +130,8 @@ static bool resctrl_is_mbm_event(int e)
}
/*
- * Trivial allocator for CLOSIDs. Since h/w only supports a small number,
- * we can keep a bitmap of free CLOSIDs in a single integer.
+ * Trivial allocator for CLOSIDs. Use BITMAP APIs to manipulate a bitmap
+ * of free CLOSIDs.
*
* Using a global CLOSID across all resources has some advantages and
* some drawbacks:
@@ -144,7 +144,7 @@ static bool resctrl_is_mbm_event(int e)
* - Our choices on how to configure each resource become progressively more
* limited as the number of resources grows.
*/
-static unsigned long closid_free_map;
+static unsigned long *closid_free_map;
static int closid_free_map_len;
int closids_supported(void)
@@ -152,20 +152,30 @@ int closids_supported(void)
return closid_free_map_len;
}
-static void closid_init(void)
+static int closid_init(void)
{
struct resctrl_schema *s;
- u32 rdt_min_closid = 32;
+ u32 rdt_min_closid = ~0;
/* Compute rdt_min_closid across all resources */
list_for_each_entry(s, &resctrl_schema_all, list)
rdt_min_closid = min(rdt_min_closid, s->num_closid);
- closid_free_map = BIT_MASK(rdt_min_closid) - 1;
+ closid_free_map = bitmap_alloc(rdt_min_closid, GFP_KERNEL);
+ if (!closid_free_map)
+ return -ENOMEM;
+ bitmap_fill(closid_free_map, rdt_min_closid);
/* RESCTRL_RESERVED_CLOSID is always reserved for the default group */
- __clear_bit(RESCTRL_RESERVED_CLOSID, &closid_free_map);
+ __clear_bit(RESCTRL_RESERVED_CLOSID, closid_free_map);
closid_free_map_len = rdt_min_closid;
+
+ return 0;
+}
+
+static void closid_exit(void)
+{
+ bitmap_free(closid_free_map);
}
static int closid_alloc(void)
@@ -182,12 +192,11 @@ static int closid_alloc(void)
return cleanest_closid;
closid = cleanest_closid;
} else {
- closid = ffs(closid_free_map);
- if (closid == 0)
+ closid = find_first_bit(closid_free_map, closid_free_map_len);
+ if (closid == closid_free_map_len)
return -ENOSPC;
- closid--;
}
- __clear_bit(closid, &closid_free_map);
+ __clear_bit(closid, closid_free_map);
return closid;
}
@@ -196,7 +205,7 @@ void closid_free(int closid)
{
lockdep_assert_held(&rdtgroup_mutex);
- __set_bit(closid, &closid_free_map);
+ __set_bit(closid, closid_free_map);
}
/**
@@ -210,7 +219,7 @@ bool closid_allocated(unsigned int closid)
{
lockdep_assert_held(&rdtgroup_mutex);
- return !test_bit(closid, &closid_free_map);
+ return !test_bit(closid, closid_free_map);
}
/**
@@ -2754,20 +2763,22 @@ static int rdt_get_tree(struct fs_context *fc)
goto out_ctx;
}
- closid_init();
+ ret = closid_init();
+ if (ret)
+ goto out_schemata_free;
if (resctrl_arch_mon_capable())
flags |= RFTYPE_MON;
ret = rdtgroup_add_files(rdtgroup_default.kn, flags);
if (ret)
- goto out_schemata_free;
+ goto out_closid_exit;
kernfs_activate(rdtgroup_default.kn);
ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
if (ret < 0)
- goto out_schemata_free;
+ goto out_closid_exit;
if (resctrl_arch_mon_capable()) {
ret = mongroup_create_dir(rdtgroup_default.kn,
@@ -2818,6 +2829,8 @@ static int rdt_get_tree(struct fs_context *fc)
kernfs_remove(kn_mongrp);
out_info:
kernfs_remove(kn_info);
+out_closid_exit:
+ closid_exit();
out_schemata_free:
schemata_list_destroy();
out_ctx:
--
2.39.2
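
The allocator hunk above swaps ffs() for find_first_bit(), and the two report results differently: ffs() is 1-based and returns 0 when no bit is set, while find_first_bit() is 0-based and returns the bitmap length when no bit is set. A minimal userspace sketch of that difference follows. sketch_find_first_bit(), old_first_free() and new_first_free() are hypothetical names for illustration only, not kernel code, and the helper is a plain loop rather than the kernel's implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <strings.h>	/* ffs() */

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace stand-in for find_first_bit(): returns the
 * index of the first set bit, or nbits if no bit is set.
 */
static size_t sketch_find_first_bit(const unsigned long *map, size_t nbits)
{
	assert(map != NULL);

	for (size_t i = 0; i < nbits; i++) {
		if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	}

	return nbits;
}

/* Old scheme: ffs() is 1-based and returns 0 when no bit is set. */
static int old_first_free(unsigned int map)
{
	int bit = ffs((int)map);

	return bit == 0 ? -1 : bit - 1;
}

/* New scheme: the index is 0-based and "not found" is reported as nbits. */
static int new_first_free(const unsigned long *map, size_t nbits)
{
	size_t bit = sketch_find_first_bit(map, nbits);

	return bit == nbits ? -1 : (int)bit;
}
```

Because the map is an array of words, the new scheme is not limited to the width of a single integer, which is the point of the patch.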
* Re: [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID
2025-02-07 18:18 ` [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID James Morse
@ 2025-02-20 4:21 ` Reinette Chatre
2025-02-28 19:53 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:21 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> From: Amit Singh Tomar <amitsinght@marvell.com>
>
> Resctrl allocates and finds free CLOSID values using the bits of a u32.
> This restricts the number of control groups that can be created by
> user-space.
>
> MPAM has an architectural limit of 2^16 CLOSID values, Intel x86 could
> be extended beyond 32 values. There is at least one MPAM platform which
> supports more than 32 CLOSID values.
>
> Replace the fixed size bitmap with calls to the bitmap API to allocate
> an array of a sufficient size.
>
> ffs() returns '1' for bit 0, hence the existing code subtracts 1 from
> the index to get the CLOSID value. find_first_bit() returns the bit
> number which does not need adjusting.
>
> Signed-off-by: Amit Singh Tomar <amitsinght@marvell.com>
> [ morse: fixed the off-by-one in the allocator and the wrong
> not-found value. Removed the limit. Rephrase the commit message. ]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v5:
> * This patch got pulled into this series.
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 45 +++++++++++++++++---------
> 1 file changed, 29 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 08fec23a38bf..de79da30d500 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -130,8 +130,8 @@ static bool resctrl_is_mbm_event(int e)
> }
>
> /*
> - * Trivial allocator for CLOSIDs. Since h/w only supports a small number,
> - * we can keep a bitmap of free CLOSIDs in a single integer.
> + * Trivial allocator for CLOSIDs. Use BITMAP APIs to manipulate a bitmap
> + * of free CLOSIDs.
> *
> * Using a global CLOSID across all resources has some advantages and
> * some drawbacks:
> @@ -144,7 +144,7 @@ static bool resctrl_is_mbm_event(int e)
> * - Our choices on how to configure each resource become progressively more
> * limited as the number of resources grows.
> */
> -static unsigned long closid_free_map;
> +static unsigned long *closid_free_map;
> static int closid_free_map_len;
>
> int closids_supported(void)
> @@ -152,20 +152,30 @@ int closids_supported(void)
> return closid_free_map_len;
> }
>
> -static void closid_init(void)
> +static int closid_init(void)
> {
> struct resctrl_schema *s;
> - u32 rdt_min_closid = 32;
> + u32 rdt_min_closid = ~0;
>
> /* Compute rdt_min_closid across all resources */
> list_for_each_entry(s, &resctrl_schema_all, list)
> rdt_min_closid = min(rdt_min_closid, s->num_closid);
>
> - closid_free_map = BIT_MASK(rdt_min_closid) - 1;
> + closid_free_map = bitmap_alloc(rdt_min_closid, GFP_KERNEL);
> + if (!closid_free_map)
> + return -ENOMEM;
> + bitmap_fill(closid_free_map, rdt_min_closid);
>
> /* RESCTRL_RESERVED_CLOSID is always reserved for the default group */
> - __clear_bit(RESCTRL_RESERVED_CLOSID, &closid_free_map);
> + __clear_bit(RESCTRL_RESERVED_CLOSID, closid_free_map);
> closid_free_map_len = rdt_min_closid;
> +
> + return 0;
> +}
> +
> +static void closid_exit(void)
> +{
> + bitmap_free(closid_free_map);
With closid_free_map being a global, could this also set
closid_free_map to NULL?
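
A minimal userspace sketch of that suggestion, under stated assumptions: calloc() and free() stand in for bitmap_alloc() and bitmap_free(), and free_map, sketch_map_init() and sketch_map_exit() are hypothetical names, not the code that was eventually merged.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the global closid_free_map. */
static unsigned long *free_map;

/* calloc() stands in for bitmap_alloc(); returns 0 on success, -1 on failure. */
static int sketch_map_init(size_t nwords)
{
	assert(free_map == NULL);	/* catches an init without a matching exit */

	free_map = calloc(nwords, sizeof(*free_map));

	return free_map ? 0 : -1;
}

/*
 * free() stands in for bitmap_free(). Clearing the pointer turns a
 * repeated call into a harmless free(NULL) instead of a double free.
 */
static void sketch_map_exit(void)
{
	free(free_map);
	free_map = NULL;
}
```

With the pointer cleared, a stale global can also be detected by a later init, as the assert() in sketch_map_init() shows.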
> }
>
> static int closid_alloc(void)
> @@ -182,12 +192,11 @@ static int closid_alloc(void)
> return cleanest_closid;
> closid = cleanest_closid;
> } else {
> - closid = ffs(closid_free_map);
> - if (closid == 0)
> + closid = find_first_bit(closid_free_map, closid_free_map_len);
> + if (closid == closid_free_map_len)
> return -ENOSPC;
> - closid--;
> }
> - __clear_bit(closid, &closid_free_map);
> + __clear_bit(closid, closid_free_map);
>
> return closid;
> }
> @@ -196,7 +205,7 @@ void closid_free(int closid)
> {
> lockdep_assert_held(&rdtgroup_mutex);
>
> - __set_bit(closid, &closid_free_map);
> + __set_bit(closid, closid_free_map);
> }
>
> /**
> @@ -210,7 +219,7 @@ bool closid_allocated(unsigned int closid)
> {
> lockdep_assert_held(&rdtgroup_mutex);
>
> - return !test_bit(closid, &closid_free_map);
> + return !test_bit(closid, closid_free_map);
> }
>
> /**
> @@ -2754,20 +2763,22 @@ static int rdt_get_tree(struct fs_context *fc)
> goto out_ctx;
> }
>
> - closid_init();
> + ret = closid_init();
> + if (ret)
> + goto out_schemata_free;
>
> if (resctrl_arch_mon_capable())
> flags |= RFTYPE_MON;
>
> ret = rdtgroup_add_files(rdtgroup_default.kn, flags);
> if (ret)
> - goto out_schemata_free;
> + goto out_closid_exit;
>
> kernfs_activate(rdtgroup_default.kn);
>
> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
> if (ret < 0)
> - goto out_schemata_free;
> + goto out_closid_exit;
>
> if (resctrl_arch_mon_capable()) {
> ret = mongroup_create_dir(rdtgroup_default.kn,
> @@ -2818,6 +2829,8 @@ static int rdt_get_tree(struct fs_context *fc)
> kernfs_remove(kn_mongrp);
> out_info:
> kernfs_remove(kn_info);
> +out_closid_exit:
> + closid_exit();
> out_schemata_free:
> schemata_list_destroy();
> out_ctx:
With closid_init() called from rdt_get_tree() during mount I expected
closid_exit() to be called from rdt_kill_sb() during unmount ?
Reinette
* Re: [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID
2025-02-20 4:21 ` Reinette Chatre
@ 2025-02-28 19:53 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi Reinette,
On 20/02/2025 04:21, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> From: Amit Singh Tomar <amitsinght@marvell.com>
>>
>> Resctrl allocates and finds free CLOSID values using the bits of a u32.
>> This restricts the number of control groups that can be created by
>> user-space.
>>
>> MPAM has an architectural limit of 2^16 CLOSID values, Intel x86 could
>> be extended beyond 32 values. There is at least one MPAM platform which
>> supports more than 32 CLOSID values.
>>
>> Replace the fixed size bitmap with calls to the bitmap API to allocate
>> an array of a sufficient size.
>>
>> ffs() returns '1' for bit 0, hence the existing code subtracts 1 from
>> the index to get the CLOSID value. find_first_bit() returns the bit
>> number which does not need adjusting.
>>
>> Signed-off-by: Amit Singh Tomar <amitsinght@marvell.com>
>> [ morse: fixed the off-by-one in the allocator and the wrong
>> not-found value. Removed the limit. Rephrase the commit message. ]
>> Signed-off-by: James Morse <james.morse@arm.com>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 08fec23a38bf..de79da30d500 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -152,20 +152,30 @@ int closids_supported(void)
>> +static void closid_exit(void)
>> +{
>> + bitmap_free(closid_free_map);
> With closid_free_map being a global, could this also set
> closid_free_map to NULL?
Makes sense,
>
>> }
>>
>> static int closid_alloc(void)
>> @@ -2754,20 +2763,22 @@ static int rdt_get_tree(struct fs_context *fc)
>> goto out_ctx;
>> }
>>
>> - closid_init();
>> + ret = closid_init();
>> + if (ret)
>> + goto out_schemata_free;
>>
>> if (resctrl_arch_mon_capable())
>> flags |= RFTYPE_MON;
>>
>> ret = rdtgroup_add_files(rdtgroup_default.kn, flags);
>> if (ret)
>> - goto out_schemata_free;
>> + goto out_closid_exit;
>>
>> kernfs_activate(rdtgroup_default.kn);
>>
>> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
>> if (ret < 0)
>> - goto out_schemata_free;
>> + goto out_closid_exit;
>>
>> if (resctrl_arch_mon_capable()) {
>> ret = mongroup_create_dir(rdtgroup_default.kn,
> With closid_init() called from rdt_get_tree() during mount I expected
> closid_exit() to be called from rdt_kill_sb() during unmount ?
Ah, I'd missed that - the old version got called multiple times but it was harmless. Now
it potentially leaks the bitmap. Fixed.
Thanks!
James
* [PATCH v6 32/42] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (30 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 31/42] x86/resctrl: Remove the limit on the number of CLOSID James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:26 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
` (11 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
resctrl_sched_in() loads the architecture specific CPU MSRs with the
CLOSID and RMID values. This function was named before resctrl was
split to have architecture specific code, and generic filesystem code.
This function is obviously architecture specific, but does not begin
with 'resctrl_arch_', making it the odd one out in the functions an
architecture needs to support to enable resctrl.
Rename it for consistency. This is purely cosmetic.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
arch/x86/include/asm/resctrl.h | 4 ++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++++++------
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 011bf67a1866..7a39728b0743 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -175,7 +175,7 @@ static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
return READ_ONCE(tsk->rmid) == rmid;
}
-static inline void resctrl_sched_in(struct task_struct *tsk)
+static inline void resctrl_arch_sched_in(struct task_struct *tsk)
{
if (static_branch_likely(&rdt_enable_key))
__resctrl_sched_in(tsk);
@@ -212,7 +212,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c);
#else
-static inline void resctrl_sched_in(struct task_struct *tsk) {}
+static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
#endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index de79da30d500..6e30283358d4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -371,7 +371,7 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
}
/*
- * This is safe against resctrl_sched_in() called from __switch_to()
+ * This is safe against resctrl_arch_sched_in() called from __switch_to()
* because __switch_to() is executed with interrupts disabled. A local call
* from update_closid_rmid() is protected against __switch_to() because
* preemption is disabled.
@@ -390,7 +390,7 @@ void resctrl_arch_sync_cpu_closid_rmid(void *info)
* executing task might have its own closid selected. Just reuse
* the context switch code.
*/
- resctrl_sched_in(current);
+ resctrl_arch_sched_in(current);
}
/*
@@ -615,7 +615,7 @@ static void _update_task_closid_rmid(void *task)
* Otherwise, the MSR is updated when the task is scheduled in.
*/
if (task == current)
- resctrl_sched_in(task);
+ resctrl_arch_sched_in(task);
}
static void update_task_closid_rmid(struct task_struct *t)
@@ -673,7 +673,7 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
* Ensure the task's closid and rmid are written before determining if
* the task is current that will decide if it will be interrupted.
* This pairs with the full barrier between the rq->curr update and
- * resctrl_sched_in() during context switch.
+ * resctrl_arch_sched_in() during context switch.
*/
smp_mb();
@@ -2978,8 +2978,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
/*
* Order the closid/rmid stores above before the loads
* in task_curr(). This pairs with the full barrier
- * between the rq->curr update and resctrl_sched_in()
- * during context switch.
+ * between the rq->curr update and
+ * resctrl_arch_sched_in() during context switch.
*/
smp_mb();
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720..8697b02dabf1 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -211,7 +211,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
switch_fpu_finish(next_p);
/* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in(next_p);
+ resctrl_arch_sched_in(next_p);
return prev_p;
}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 226472332a70..3f1235d3bf1d 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -707,7 +707,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
}
/* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in(next_p);
+ resctrl_arch_sched_in(next_p);
return prev_p;
}
--
2.39.2
* Re: [PATCH v6 32/42] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
2025-02-07 18:18 ` [PATCH v6 32/42] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
@ 2025-02-20 4:26 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:26 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> resctrl_sched_in() loads the architecture specific CPU MSRs with the
> CLOSID and RMID values. This function was named before resctrl was
> split to have architecture specific code, and generic filesystem code.
>
> This function is obviously architecture specific, but does not begin
> with 'resctrl_arch_', making it the odd one out in the functions an
> architecture needs to support to enable resctrl.
>
> Rename it for consistency. This is purely cosmetic.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
* [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (31 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 32/42] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:42 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 34/42] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
` (10 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
resctrl can't be built as a module, and the kernfs helpers are not exported
so this is unlikely to change. MPAM has an error interrupt which indicates
the MPAM driver has gone haywire. Should this occur tasks could run with
the wrong control values, leading to bad performance for important tasks.
The MPAM driver needs a way to tell resctrl that no further configuration
should be attempted.
Using resctrl_exit() for this leaves the system in a funny state as
resctrl is still mounted, but cannot be un-mounted because the sysfs
directory that is typically used has been removed. Dave Martin suggests
this may cause systemd trouble in the future as not all filesystems
can be unmounted.
Add calls to remove all the files and directories in resctrl, and
remove the sysfs_remove_mount_point() call that leaves the system
in a funny state. When triggered, this causes all the resctrl files
to disappear. resctrl can be unmounted, but not mounted again.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Serialise rdtgroup_destroy_root() against umount().
* Check rdtgroup_default.kn to protect against duplicate calls.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6e30283358d4..424622d2f959 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4092,8 +4092,12 @@ static int rdtgroup_setup_root(struct rdt_fs_context *ctx)
static void rdtgroup_destroy_root(void)
{
- kernfs_destroy_root(rdt_root);
- rdtgroup_default.kn = NULL;
+ lockdep_assert_held(&rdtgroup_mutex);
+
+ if (rdtgroup_default.kn) {
+ kernfs_destroy_root(rdt_root);
+ rdtgroup_default.kn = NULL;
+ }
}
static void __init rdtgroup_setup_default(void)
@@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
void __exit resctrl_exit(void)
{
+ mutex_lock(&rdtgroup_mutex);
+ rdtgroup_destroy_root();
+ mutex_unlock(&rdtgroup_mutex);
+
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
- sysfs_remove_mount_point(fs_kobj, "resctrl");
resctrl_mon_resource_exit();
}
--
2.39.2
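
The rdtgroup_destroy_root() hunk above guards against duplicate calls by testing rdtgroup_default.kn before destroying the root and clearing it afterwards. A minimal userspace sketch of that guard follows. All names (sketch_node, default_kn, sketch_destroy(), sketch_destroy_root()) are hypothetical, and locking is left out: the real function asserts that rdtgroup_mutex is held.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_node {
	int id;
};

static struct sketch_node root_node = { .id = 1 };

/* Hypothetical stand-in for rdtgroup_default.kn. */
static struct sketch_node *default_kn = &root_node;

/* Counts how often the (pretend) destructor actually ran. */
static int destroy_calls;

static void sketch_destroy(struct sketch_node *kn)
{
	assert(kn != NULL);
	destroy_calls++;
}

/* Destroy the root once; later calls see the cleared pointer and do nothing. */
static void sketch_destroy_root(void)
{
	if (default_kn) {
		sketch_destroy(default_kn);
		default_kn = NULL;
	}
}
```

Calling sketch_destroy_root() twice runs the destructor only once, which is the property the v6 changelog entry asks for.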
* Re: [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-02-07 18:18 ` [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2025-02-20 4:42 ` Reinette Chatre
2025-02-28 19:54 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:42 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
> resctrl can't be built as a module, and the kernfs helpers are not exported
> so this is unlikely to change. MPAM has an error interrupt which indicates
> the MPAM driver has gone haywire. Should this occur tasks could run with
> the wrong control values, leading to bad performance for important tasks.
> The MPAM driver needs a way to tell resctrl that no further configuration
> should be attempted.
>
> Using resctrl_exit() for this leaves the system in a funny state as
> resctrl is still mounted, but cannot be un-mounted because the sysfs
> directory that is typically used has been removed. Dave Martin suggests
> this may cause systemd trouble in the future as not all filesystems
> can be unmounted.
>
> Add calls to remove all the files and directories in resctrl, and
> remove the sysfs_remove_mount_point() call that leaves the system
> in a funny state. When triggered, this causes all the resctrl files
> to disappear. resctrl can be unmounted, but not mounted again.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> Changes since v5:
> * Serialise rdtgroup_destroy_root() against umount().
> * Check rdtgroup_default.kn to protect against duplicate calls.
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 6e30283358d4..424622d2f959 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4092,8 +4092,12 @@ static int rdtgroup_setup_root(struct rdt_fs_context *ctx)
>
> static void rdtgroup_destroy_root(void)
> {
> - kernfs_destroy_root(rdt_root);
> - rdtgroup_default.kn = NULL;
> + lockdep_assert_held(&rdtgroup_mutex);
> +
> + if (rdtgroup_default.kn) {
> + kernfs_destroy_root(rdt_root);
> + rdtgroup_default.kn = NULL;
> + }
> }
>
> static void __init rdtgroup_setup_default(void)
> @@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
>
Could you please add the kerneldoc you proposed in
https://lore.kernel.org/lkml/f2ecb501-bc65-49a9-903d-80ba1737845f@arm.com/ ?
> void __exit resctrl_exit(void)
> {
> + mutex_lock(&rdtgroup_mutex);
> + rdtgroup_destroy_root();
> + mutex_unlock(&rdtgroup_mutex);
> +
> debugfs_remove_recursive(debugfs_resctrl);
> unregister_filesystem(&rdt_fs_type);
> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>
> resctrl_mon_resource_exit();
> }
It is difficult for me to follow the kernfs reference counting required
to make this work. Specifically, the root kn is "destroyed" here but it
is required to stick around until unmount when the rest of the files
are removed. Have you been able to test this flow? I think you mentioned
something like this before but I cannot find the details now.
Reinette
* Re: [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-02-20 4:42 ` Reinette Chatre
@ 2025-02-28 19:54 ` James Morse
2025-03-01 2:35 ` Reinette Chatre
0 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-28 19:54 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 04:42, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>> resctrl can't be built as a module, and the kernfs helpers are not exported
>> so this is unlikely to change. MPAM has an error interrupt which indicates
>> the MPAM driver has gone haywire. Should this occur tasks could run with
>> the wrong control values, leading to bad performance for important tasks.
>> The MPAM driver needs a way to tell resctrl that no further configuration
>> should be attempted.
>>
>> Using resctrl_exit() for this leaves the system in a funny state as
>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>> directory that is typically used has been removed. Dave Martin suggests
>> this may cause systemd trouble in the future as not all filesystems
>> can be unmounted.
>>
>> Add calls to remove all the files and directories in resctrl, and
>> remove the sysfs_remove_mount_point() call that leaves the system
>> in a funny state. When triggered, this causes all the resctrl files
>> to disappear. resctrl can be unmounted, but not mounted again.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 6e30283358d4..424622d2f959 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
>>
>
> Could you please add the kerneldoc you proposed in
> https://lore.kernel.org/lkml/f2ecb501-bc65-49a9-903d-80ba1737845f@arm.com/ ?
Huh. The way that is indented means I copied it out of the file - I'm not sure what went
wrong there. Thanks for fishing out the link!
>> void __exit resctrl_exit(void)
>> {
>> + mutex_lock(&rdtgroup_mutex);
>> + rdtgroup_destroy_root();
>> + mutex_unlock(&rdtgroup_mutex);
>> +
>> debugfs_remove_recursive(debugfs_resctrl);
>> unregister_filesystem(&rdt_fs_type);
>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>>
>> resctrl_mon_resource_exit();
>> }
>
> It is difficult for me to follow the kernfs reference counting required
> to make this work. Specifically, the root kn is "destroyed" here but it
> is required to stick around until unmount when the rest of the files
> are removed.
This drops resctrl's reference to all of the files, which would make the files disappear.
unmount is what calls kernfs_kill_sb(), which gets rid of the root of the filesystem.
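
As a toy model of that explanation only, and not of how kernfs is actually implemented: assume the root object is kept alive by two references, one held by the filesystem code and one held by the mount. Dropping the first leaves the object around until the second is dropped at unmount. sketch_obj, sketch_get() and sketch_put() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reference-counted object; "released" records that the last put ran. */
struct sketch_obj {
	int refs;
	bool released;
};

static void sketch_get(struct sketch_obj *o)
{
	assert(!o->released);
	o->refs++;
}

/* Drop one reference; the object is released only when the count hits zero. */
static void sketch_put(struct sketch_obj *o)
{
	assert(o->refs > 0);

	if (--o->refs == 0)
		o->released = true;
}
```

In this model, taking two references and then calling sketch_put() once (the teardown) leaves refs at 1, and only the second sketch_put() (the unmount) releases the object.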
> Have you been able to test this flow? I think you mentioned
> something like this before but I cannot find the details now.
Yes:
https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot%2bextras/v6.14-rc1&id=8c96f858b25aa42694c5db56a2afe255ed8262dd
This is a debugfs file that schedules the threaded bit of the MPAM error interrupt
handler. I figure it's MPAM specific because there is no way into this code on x86.
(the aim is to get the CI to tickle this)
Thanks,
James
* Re: [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-02-28 19:54 ` James Morse
@ 2025-03-01 2:35 ` Reinette Chatre
2025-03-06 19:28 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-03-01 2:35 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/28/25 11:54 AM, James Morse wrote:
> Hi Reinette,
>
> On 20/02/2025 04:42, Reinette Chatre wrote:
>> On 2/7/25 10:18 AM, James Morse wrote:
>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>> the wrong control values, leading to bad performance for important tasks.
>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>> should be attempted.
>>>
>>> Using resctrl_exit() for this leaves the system in a funny state as
>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>> directory that is typically used has been removed. Dave Martin suggests
>>> this may cause systemd trouble in the future as not all filesystems
>>> can be unmounted.
>>>
>>> Add calls to remove all the files and directories in resctrl, and
>>> remove the sysfs_remove_mount_point() call that leaves the system
>>> in a funny state. When triggered, this causes all the resctrl files
>>> to disappear. resctrl can be unmounted, but not mounted again.
>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 6e30283358d4..424622d2f959 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
>>>
>>
>> Could you please add the kerneldoc you proposed in
>> https://lore.kernel.org/lkml/f2ecb501-bc65-49a9-903d-80ba1737845f@arm.com/ ?
>
> Huh. The way that is indented means I copied it out of the file - I'm not sure what went
> wrong there. Thanks for fishing out the link!
>
>
>>> void __exit resctrl_exit(void)
>>> {
>>> + mutex_lock(&rdtgroup_mutex);
>>> + rdtgroup_destroy_root();
>>> + mutex_unlock(&rdtgroup_mutex);
>>> +
>>> debugfs_remove_recursive(debugfs_resctrl);
>>> unregister_filesystem(&rdt_fs_type);
>>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>>>
>>> resctrl_mon_resource_exit();
>>> }
>>
>> It is difficult for me to follow the kernfs reference counting required
>> to make this work. Specifically, the root kn is "destroyed" here but it
>> is required to stick around until unmount when the rest of the files
>> are removed.
>
> This drops resctrl's reference to all of the files, which would make the files disappear.
> unmount is what calls kernfs_kill_sb(), which gets rid of the root of the filesystem.
My concern is mostly with the kernfs_remove() calls in the rdt_kill_sb()->rmdir_all_sub()
flow. For example:
kernfs_remove(kn_info);
kernfs_remove(kn_mongrp);
kernfs_remove(kn_mondata);
As I understand it, the above require the destroyed root to still be around.
>
>
>> Have you been able to test this flow? I think you mentioned
>> something like this before but I cannot find the details now.
>
> Yes:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot%2bextras/v6.14-rc1&id=8c96f858b25aa42694c5db56a2afe255ed8262dd
>
> This is a debugfs file that schedules the threaded bit of the MPAM error interrupt
> handler. I figure its MPAM specific because there is no way into this code on x86.
> (the aim is to get the CI to tickle this)
Thank you.
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-03-01 2:35 ` Reinette Chatre
@ 2025-03-06 19:28 ` James Morse
2025-03-07 4:47 ` Reinette Chatre
0 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-03-06 19:28 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 01/03/2025 02:35, Reinette Chatre wrote:
> On 2/28/25 11:54 AM, James Morse wrote:
>> On 20/02/2025 04:42, Reinette Chatre wrote:
>>> On 2/7/25 10:18 AM, James Morse wrote:
>>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>>> the wrong control values, leading to bad performance for important tasks.
>>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>>> should be attempted.
>>>>
>>>> Using resctrl_exit() for this leaves the system in a funny state as
>>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>>> directory that is typically used has been removed. Dave Martin suggests
>>>> this may cause systemd trouble in the future as not all filesystems
>>>> can be unmounted.
>>>>
>>>> Add calls to remove all the files and directories in resctrl, and
>>>> remove the sysfs_remove_mount_point() call that leaves the system
>>>> in a funny state. When triggered, this causes all the resctrl files
>>>> to disappear. resctrl can be unmounted, but not mounted again.
>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index 6e30283358d4..424622d2f959 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
>>>>
>>>
>>> Could you please add the kerneldoc you proposed in
>>> https://lore.kernel.org/lkml/f2ecb501-bc65-49a9-903d-80ba1737845f@arm.com/ ?
>>
>> Huh. The way that is indented means I copied it out the file - I'm not sure what went wrong
>> there. Thanks for fishing out the link!
>>
>>
>>>> void __exit resctrl_exit(void)
>>>> {
>>>> + mutex_lock(&rdtgroup_mutex);
>>>> + rdtgroup_destroy_root();
>>>> + mutex_unlock(&rdtgroup_mutex);
>>>> +
>>>> debugfs_remove_recursive(debugfs_resctrl);
>>>> unregister_filesystem(&rdt_fs_type);
>>>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>>>>
>>>> resctrl_mon_resource_exit();
>>>> }
>>>
>>> It is difficult for me to follow the kernfs reference counting required
>>> to make this work. Specifically, the root kn is "destroyed" here but it
>>> is required to stick around until unmount when the rest of the files
>>> are removed.
>>
>> This drops resctrl's reference to all of the files, which would make the files disappear.
>> unmount is what calls kernfs_kill_sb(), which gets rid of the root of the filesystem.
>
> My concern is mostly with the kernfs_remove() calls in the rdt_kill_sb()->rmdir_all_sub()
> flow. For example:
> kernfs_remove(kn_info);
> kernfs_remove(kn_mongrp);
> kernfs_remove(kn_mondata);
>
> As I understand the above require the destroyed root to still be around.
Right - because rdt_get_tree() has these global pointers into the hierarchy, but doesn't
take a reference. rmdir_all_sub() relies on always being called before
rdtgroup_destroy_root().
The point hack would be for rdtgroup_destroy_root() to NULL out those global pointers, (I
note they are left dangling) - that would make a subsequent call to rmdir_all_sub() harmless.
A better fix would be to pull out all the filesystem relevant parts from rdt_kill_sb(),
make that safe for multiple calls and get resctrl_exit() to call that.
A call to rdt_kill_sb() after resctrl_exit() would just clean up the super-block.
This will leave things in a more predictable state.
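To illustrate the first option above (NULL out the global pointers so that a later teardown pass is harmless), here is a minimal userspace sketch. It is not the resctrl code: struct node, node_create(), node_remove(), fs_setup(), fs_teardown() and nodes_removed are all hypothetical names, and only the pointer names kn_info, kn_mongrp and kn_mondata are borrowed from the discussion. node_remove() is assumed to tolerate NULL, which is what makes the second call a no-op.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a filesystem node; not struct kernfs_node. */
struct node {
	int id;
};

/* Global pointers into the hierarchy, held without taking a reference. */
static struct node *kn_info;
static struct node *kn_mongrp;
static struct node *kn_mondata;

/* Counts nodes that were actually released; for illustration only. */
static int nodes_removed;

static struct node *node_create(int id)
{
	struct node *n = malloc(sizeof(*n));

	if (n)
		n->id = id;
	return n;
}

/* Assumed to tolerate NULL, so repeated teardown does nothing. */
static void node_remove(struct node *n)
{
	if (!n)
		return;
	free(n);
	nodes_removed++;
}

/*
 * Safe to call more than once: each pointer is cleared after removal,
 * so nothing is left dangling for a later caller to trip over.
 */
static void fs_teardown(void)
{
	node_remove(kn_info);
	kn_info = NULL;
	node_remove(kn_mongrp);
	kn_mongrp = NULL;
	node_remove(kn_mondata);
	kn_mondata = NULL;
}

/* Returns 0 on success, -1 if any allocation failed. */
static int fs_setup(void)
{
	kn_info = node_create(1);
	kn_mongrp = node_create(2);
	kn_mondata = node_create(3);

	if (!kn_info || !kn_mongrp || !kn_mondata) {
		fs_teardown();
		return -1;
	}
	return 0;
}
```

Calling fs_teardown() from both an error path and an unmount path then releases each node exactly once, whichever runs first.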
>>> Have you been able to test this flow? I think you mentioned
>>> something like this before but I cannot find the details now.
>>
>> Yes:
>> https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot%2bextras/v6.14-rc1&id=8c96f858b25aa42694c5db56a2afe255ed8262dd
>>
>> This is a debugfs file that schedules the threaded bit of the MPAM error interrupt
>> handler. I figure its MPAM specific because there is no way into this code on x86.
>> (the aim is to get the CI to tickle this)
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2025-03-06 19:28 ` James Morse
@ 2025-03-07 4:47 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-03-07 4:47 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 3/6/25 11:28 AM, James Morse wrote:
> Hi Reinette,
>
> On 01/03/2025 02:35, Reinette Chatre wrote:
>> On 2/28/25 11:54 AM, James Morse wrote:
>>> On 20/02/2025 04:42, Reinette Chatre wrote:
>>>> On 2/7/25 10:18 AM, James Morse wrote:
>>>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>>>> the wrong control values, leading to bad performance for important tasks.
>>>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>>>> should be attempted.
>>>>>
>>>>> Using resctrl_exit() for this leaves the system in a funny state as
>>>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>>>> directory that is typically used has been removed. Dave Martin suggests
>>>>> this may cause systemd trouble in the future as not all filesystems
>>>>> can be unmounted.
>>>>>
>>>>> Add calls to remove all the files and directories in resctrl, and
>>>>> remove the sysfs_remove_mount_point() call that leaves the system
>>>>> in a funny state. When triggered, this causes all the resctrl files
>>>>> to disappear. resctrl can be unmounted, but not mounted again.
>>>
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> index 6e30283358d4..424622d2f959 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> @@ -4371,9 +4375,12 @@ int __init resctrl_init(void)
>>>>>
>>>>
>>>> Could you please add the kerneldoc you proposed in
>>>> https://lore.kernel.org/lkml/f2ecb501-bc65-49a9-903d-80ba1737845f@arm.com/ ?
>>>
>>> Huh. The way that is indented means I copied it out the file - I'm not sure what went wrong
>>> there. Thanks for fishing out the link!
>>>
>>>
>>>>> void __exit resctrl_exit(void)
>>>>> {
>>>>> + mutex_lock(&rdtgroup_mutex);
>>>>> + rdtgroup_destroy_root();
>>>>> + mutex_unlock(&rdtgroup_mutex);
>>>>> +
>>>>> debugfs_remove_recursive(debugfs_resctrl);
>>>>> unregister_filesystem(&rdt_fs_type);
>>>>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>>>>>
>>>>> resctrl_mon_resource_exit();
>>>>> }
>>>>
>>>> It is difficult for me to follow the kernfs reference counting required
>>>> to make this work. Specifically, the root kn is "destroyed" here but it
>>>> is required to stick around until unmount when the rest of the files
>>>> are removed.
>>>
>>> This drops resctrl's reference to all of the files, which would make the files disappear.
>>> unmount is what calls kernfs_kill_sb(), which gets rid of the root of the filesystem.
>>
>> My concern is mostly with the kernfs_remove() calls in the rdt_kill_sb()->rmdir_all_sub()
>> flow. For example:
>> kernfs_remove(kn_info);
>> kernfs_remove(kn_mongrp);
>> kernfs_remove(kn_mondata);
>>
>> As I understand the above require the destroyed root to still be around.
>
> Right - because rdt_get_tree() has these global pointers into the hierarchy, but doesn't
> take a reference. rmdir_all_sub() relies on always being called before
> rdtgroup_destroy_root().
>
> The point hack would be for rdtgroup_destroy_root() to NULL out those global pointers, (I
> note they are left dangling) - that would make a subsequent call to rmdir_all_sub() harmless.
>
> A better fix would be to pull out all the filesystem relevant parts from rdt_kill_sb(),
> make that safe for multiple calls and get resctrl_exit() to call that.
> A call to rdt_kill_sb() after resctrl_exit() would just cleanup the super-block.
> This will leave things in a more predictable state.
>
>
Since V7 has already been posted, I tried to keep things coherent by copying this
exchange and responding to you there [1].
Reinette
[1] https://lore.kernel.org/lkml/053d8a62-022b-4bf8-8e47-651e7c3a2d59@intel.com/
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 34/42] x86/resctrl: Drop __init/__exit on assorted symbols
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (32 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 33/42] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:46 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 35/42] x86/resctrl: Move is_mba_sc() out of core.c James Morse
` (9 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Because ARM's MPAM controls are probed using MMIO, resctrl can't be
initialised until enough CPUs are online to have determined the
system-wide supported num_closid. Arm64 also supports 'late onlined
secondaries', where only a subset of CPUs are online during boot.
These two combine to mean the MPAM driver may not be able to initialise
resctrl until user-space has brought 'enough' CPUs online.
To allow MPAM to initialise resctrl after __init text has been free'd,
remove all the __init markings from resctrl.
The existing __exit markings cause these functions to be removed by the
linker as it has never been possible to build resctrl as a module. MPAM
has an error interrupt which causes the driver to reset and disable
itself. Remove the __exit markings to allow the MPAM driver to tear down
resctrl when an error occurs.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v4:
* Earlier __init marker removal migrated here.
---
arch/x86/kernel/cpu/resctrl/core.c | 6 +++---
arch/x86/kernel/cpu/resctrl/internal.h | 4 ++--
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 +++---
include/linux/resctrl.h | 6 +++---
5 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 921c351d57ae..31558d7abe54 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -738,7 +738,7 @@ struct rdt_options {
bool force_off, force_on;
};
-static struct rdt_options rdt_options[] __initdata = {
+static struct rdt_options rdt_options[] __ro_after_init = {
RDT_OPT(RDT_FLAG_CMT, "cmt", X86_FEATURE_CQM_OCCUP_LLC),
RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
@@ -778,7 +778,7 @@ static int __init set_rdt_options(char *str)
}
__setup("rdt", set_rdt_options);
-bool __init rdt_cpu_has(int flag)
+bool rdt_cpu_has(int flag)
{
bool ret = boot_cpu_has(flag);
struct rdt_options *o;
@@ -798,7 +798,7 @@ bool __init rdt_cpu_has(int flag)
return ret;
}
-bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
{
if (!rdt_cpu_has(X86_FEATURE_BMEC))
return false;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c44c5b496355..32ed9aeffb90 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -481,13 +481,13 @@ int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
void resctrl_mon_resource_exit(void);
-bool __init rdt_cpu_has(int flag);
+bool rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
cpumask_t *cpumask, int evtid, int first);
-int __init resctrl_mon_resource_init(void);
+int resctrl_mon_resource_init(void);
void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
unsigned long delay_ms,
int exclude_cpu);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 470cf16f506e..a9168913c153 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1191,7 +1191,7 @@ static __init int snc_get_config(void)
*
* Returns 0 for success, or -ENOMEM.
*/
-int __init resctrl_mon_resource_init(void)
+int resctrl_mon_resource_init(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
int ret;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 424622d2f959..2bf69bcfa47a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4100,7 +4100,7 @@ static void rdtgroup_destroy_root(void)
}
}
-static void __init rdtgroup_setup_default(void)
+static void rdtgroup_setup_default(void)
{
mutex_lock(&rdtgroup_mutex);
@@ -4316,7 +4316,7 @@ void resctrl_offline_cpu(unsigned int cpu)
*
* Return: 0 on success or -errno
*/
-int __init resctrl_init(void)
+int resctrl_init(void)
{
int ret = 0;
@@ -4373,7 +4373,7 @@ int __init resctrl_init(void)
return ret;
}
-void __exit resctrl_exit(void)
+void resctrl_exit(void)
{
mutex_lock(&rdtgroup_mutex);
rdtgroup_destroy_root();
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 511dab4ffcdc..a8ff2cdba2c6 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -359,7 +359,7 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
u32 resctrl_arch_system_num_rmid_idx(void);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
-bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
/**
* resctrl_arch_mon_event_config_write() - Write the config for an event.
@@ -571,7 +571,7 @@ void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
-int __init resctrl_init(void);
-void __exit resctrl_exit(void);
+int resctrl_init(void);
+void resctrl_exit(void);
#endif /* _RESCTRL_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 34/42] x86/resctrl: Drop __init/__exit on assorted symbols
2025-02-07 18:18 ` [PATCH v6 34/42] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
@ 2025-02-20 4:46 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:46 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> Because ARM's MPAM controls are probed using MMIO, resctrl can't be
> initialised until enough CPUs are online to have determined the
> system-wide supported num_closid. Arm64 also supports 'late onlined
> secondaries', where only a subset of CPUs are online during boot.
>
> These two combine to mean the MPAM driver may not be able to initialise
> resctrl until user-space has brought 'enough' CPUs online.
>
> To allow MPAM to initialise resctrl after __init text has been free'd,
> remove all the __init markings from resctrl.
>
> The existing __exit markings cause these functions to be removed by the
> linker as it has never been possible to build resctrl as a module. MPAM
> has an error interrupt which causes the driver to reset and disable
> itself. Remove the __exit markings to allow the MPAM driver to tear down
> resctrl when an error occurs.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 35/42] x86/resctrl: Move is_mba_sc() out of core.c
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (33 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 34/42] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:48 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
` (8 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
is_mba_sc() is defined in core.c, but has no callers there. It does
not access any architecture private structures.
Move this to rdtgroup.c where the majority of callers are. This makes
the move of the filesystem code to /fs/ cleaner.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 15 ---------------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 31558d7abe54..6303c0ee0ae2 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -166,21 +166,6 @@ static inline void cache_alloc_hsw_probe(void)
rdt_alloc_capable = true;
}
-bool is_mba_sc(struct rdt_resource *r)
-{
- if (!r)
- r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
-
- /*
- * The software controller support is only applicable to MBA resource.
- * Make sure to check for resource type.
- */
- if (r->rid != RDT_RESOURCE_MBA)
- return false;
-
- return r->membw.mba_sc;
-}
-
/*
* rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
* exposed to user interface and the h/w understandable delay values.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2bf69bcfa47a..6832ae603db3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1520,6 +1520,21 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
return size;
}
+bool is_mba_sc(struct rdt_resource *r)
+{
+ if (!r)
+ r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
+
+ /*
+ * The software controller support is only applicable to MBA resource.
+ * Make sure to check for resource type.
+ */
+ if (r->rid != RDT_RESOURCE_MBA)
+ return false;
+
+ return r->membw.mba_sc;
+}
+
/*
* rdtgroup_size_show - Display size in bytes of allocated regions
*
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 35/42] x86/resctrl: Move is_mba_sc() out of core.c
2025-02-07 18:18 ` [PATCH v6 35/42] x86/resctrl: Move is_mba_sc() out of core.c James Morse
@ 2025-02-20 4:48 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:48 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> is_mba_sc() is defined in core.c, but has no callers there. It does
> not access any architecture private structures.
>
> Move this to rdtgroup.c where the majority of callers are. This makes
> the move of the filesystem code to /fs/ cleaner.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (34 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 35/42] x86/resctrl: Move is_mba_sc() out of core.c James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 4:50 ` Reinette Chatre
2025-02-27 20:26 ` Moger, Babu
2025-02-07 18:18 ` [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits James Morse
` (7 subsequent siblings)
43 siblings, 2 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The resctrl_event_id enum gives names to the counter event numbers on x86.
These are used directly by resctrl.
To allow the MPAM driver to keep an array of these the size of the enum
needs to be known.
Add a 'num_events' define which can be used to size an array. This isn't
a member of the enum to avoid updating switch statements that would
otherwise be missing a case.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
include/linux/resctrl_types.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 51c51a1aabfb..70226f5ab3e3 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -51,4 +51,6 @@ enum resctrl_event_id {
QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};
+#define QOS_NUM_EVENTS (QOS_L3_MBM_LOCAL_EVENT_ID + 1)
+
#endif /* __LINUX_RESCTRL_TYPES_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum
2025-02-07 18:18 ` [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
@ 2025-02-20 4:50 ` Reinette Chatre
2025-02-27 20:26 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 4:50 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> The resctrl_event_id enum gives names to the counter event numbers on x86.
> These are used directly by resctrl.
>
> To allow the MPAM driver to keep an array of these the size of the enum
> needs to be known.
>
> Add a 'num_events' define which can be used to size an array. This isn't
> a member of the enum to avoid updating switch statements that would
> otherwise be missing a case.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum
2025-02-07 18:18 ` [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
2025-02-20 4:50 ` Reinette Chatre
@ 2025-02-27 20:26 ` Moger, Babu
2025-02-28 19:55 ` James Morse
1 sibling, 1 reply; 135+ messages in thread
From: Moger, Babu @ 2025-02-27 20:26 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi James,
On 2/7/25 12:18, James Morse wrote:
> The resctrl_event_id enum gives names to the counter event numbers on x86.
> These are used directly by resctrl.
>
> To allow the MPAM driver to keep an array of these the size of the enum
> needs to be known.
>
> Add a 'num_events' define which can be used to size an array. This isn't
> a member of the enum to avoid updating switch statements that would
> otherwise be missing a case.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> include/linux/resctrl_types.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
> index 51c51a1aabfb..70226f5ab3e3 100644
> --- a/include/linux/resctrl_types.h
> +++ b/include/linux/resctrl_types.h
> @@ -51,4 +51,6 @@ enum resctrl_event_id {
> QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
> };
>
> +#define QOS_NUM_EVENTS (QOS_L3_MBM_LOCAL_EVENT_ID + 1)
Why can't this be part of "enum resctrl_event_id" like we defined
RDT_NUM_RESOURCES?
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum
2025-02-27 20:26 ` Moger, Babu
@ 2025-02-28 19:55 ` James Morse
2025-02-28 20:59 ` Luck, Tony
0 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-28 19:55 UTC (permalink / raw)
To: babu.moger, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi, D Scott Phillips OS,
carl, lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang,
Jamie Iles, Xin Hao, peternewman, dfustini, amitsinght,
David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Babu,
On 27/02/2025 20:26, Moger, Babu wrote:
> On 2/7/25 12:18, James Morse wrote:
>> The resctrl_event_id enum gives names to the counter event numbers on x86.
>> These are used directly by resctrl.
>>
>> To allow the MPAM driver to keep an array of these the size of the enum
>> needs to be known.
>>
>> Add a 'num_events' define which can be used to size an array. This isn't
>> a member of the enum to avoid updating switch statements that would
>> otherwise be missing a case.
>> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
>> index 51c51a1aabfb..70226f5ab3e3 100644
>> --- a/include/linux/resctrl_types.h
>> +++ b/include/linux/resctrl_types.h
>> @@ -51,4 +51,6 @@ enum resctrl_event_id {
>> QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
>> };
>>
>> +#define QOS_NUM_EVENTS (QOS_L3_MBM_LOCAL_EVENT_ID + 1)
> Why cant this be part of "enum resctrl_event_id" like we defined
> RDT_NUM_RESOURCES?
Maybe it's a difference that only exists in my head, but the rdt resource array is
completely a resctrl concept, the positions in the enum don't mean anything.
Not so for resctrl_event_id - those numbers mean something to the X86 CPUs. Resctrl
needs some unique identifier for those, and it's simpler just to use these directly. I
didn't want to add anything to this enum.
If there are mpam specific events, (currently there is only the risk of bandwidth counters
on the L2, or scattered at random through the system), I'd prefer to support them via perf
and keep them out of here completely.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* RE: [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum
2025-02-28 19:55 ` James Morse
@ 2025-02-28 20:59 ` Luck, Tony
0 siblings, 0 replies; 135+ messages in thread
From: Luck, Tony @ 2025-02-28 20:59 UTC (permalink / raw)
To: James Morse, babu.moger@amd.com, x86@kernel.org,
linux-kernel@vger.kernel.org
Cc: Chatre, Reinette, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS, carl@os.amperecomputing.com,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
tan.shaopeng@fujitsu.com, baolin.wang@linux.alibaba.com,
Jamie Iles, Xin Hao, peternewman@google.com,
dfustini@baylibre.com, amitsinght@marvell.com, David Hildenbrand,
Rex Nie, Dave Martin, Ko, Koba, Shanker Donthineni, Shaopeng Tan
> >> @@ -51,4 +51,6 @@ enum resctrl_event_id {
> >> QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
> >> };
> >>
> >> +#define QOS_NUM_EVENTS (QOS_L3_MBM_LOCAL_EVENT_ID + 1)
>
> > Why can't this be part of "enum resctrl_event_id" like we defined
> > RDT_NUM_RESOURCES?
>
> Maybe it's a difference that only exists in my head, but the rdt resource array is
> completely a resctrl concept, the positions in the enum don't mean anything.
> Not so for resctrl_event_id - those numbers mean something to the X86 CPUs. Resctrl
> needs some unique identifier for those, and it's simpler just to use these directly. I
> didn't want to add anything to this enum.
>
> If there are mpam specific events, (currently there is only the risk of bandwidth counters
> on the L2, or scattered at random through the system), I'd prefer to support them via perf
> and keep them out of here completely.
Also note that resctrl code has some "switch (evtid) {" code ... if you make QOS_NUM_EVENTS
a member of the enum, then the compiler will warn if you don't have a "default:" or a
"case QOS_NUM_EVENTS:" to cover all the options.
We don't have any "switch (r->rid)"
-Tony
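
The compiler behaviour described above can be tried with a small standalone
sketch. This is not code from the series: the first two enumerators are not
shown in the quoted hunk and are filled in here as assumptions, and
event_name() is a hypothetical helper. Built with -Wall, the switch needs no
"default:" because every enumerator has a case. If QOS_NUM_EVENTS were moved
into the enum, -Wswitch would report it as unhandled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Event ids as in the quoted hunk. Only QOS_L3_MBM_LOCAL_EVENT_ID appears
 * there; the other two enumerators are assumed for this sketch.
 */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/* Kept outside the enum, as in the patch. */
#define QOS_NUM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID + 1)

/* Large enough to index an array by event id. */
static_assert(QOS_NUM_EVENTS == 4, "one past the largest event id");

/* Hypothetical helper, for illustration only. */
static const char *event_name(enum resctrl_event_id evtid)
{
	/*
	 * No default: label, so -Wswitch checks that every enumerator has
	 * a case. An end-marker inside the enum would need a case here too.
	 */
	switch (evtid) {
	case QOS_L3_OCCUP_EVENT_ID:
		return "llc_occupancy";
	case QOS_L3_MBM_TOTAL_EVENT_ID:
		return "mbm_total_bytes";
	case QOS_L3_MBM_LOCAL_EVENT_ID:
		return "mbm_local_bytes";
	}

	/* Reached only for values that are not enumerators. */
	return NULL;
}
```

Keeping the count as a #define leaves such switch statements free of a dummy
case while still giving array users a size.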
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (35 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 36/42] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 5:40 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 38/42] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
` (6 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
MPAM platforms retrieve the cache-id property from the ACPI PPTT table.
The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes
the domain-id, and is packed into the mon_data_bits union bitfield.
The width of cache-id in this field is 14 bits.
Expanding the union would break 32bit x86 platforms as this union is
stored as the kernfs kn->priv pointer. This saved allocating memory
for the priv data storage.
The firmware on MPAM platforms has used the PPTT cache-id field to
expose the interconnect's id for the cache, which is sparse and uses
more than 14 bits. Use of this id is to enable PCIe direct cache
injection hints. Using this feature with VFIO means the value provided
by the ACPI table should be exposed to user-space.
To support cache-id values greater than 14 bits, convert the
mon_data_bits union to a structure. This is allocated when the kernfs
file is created, and free'd when the monitor directory is rmdir'd.
Readers and writers must hold the rdtgroup_mutex, and readers should
check for a NULL pointer to protect against an open file preventing
the kernfs file from being free'd immediately after the rmdir call.
Signed-off-by: James Morse <james.morse@arm.com>
---
Previously the MPAM tree repainted the cache-id to compact them,
arguing there was no other user. With VFIO use of this PCIe feature,
this is no longer an option:
http://inbox.dpdk.org/dev/PH7PR12MB8596BF09963460CEAE17582E82522@PH7PR12MB8596.namprd12.prod.outlook.com/
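
For readers who want to see the 14 bit limit from the commit message in
action, here is a minimal user-space sketch. It is not part of the patch. The
field widths are copied from the union being removed, evtid is declared as
unsigned int so the sketch is standalone, and the two domid_via_*() helpers
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Layout of the union this patch removes: 10 + 7 + 1 + 14 = 32 bits. */
union mon_data_bits {
	void *priv;
	struct {
		unsigned int rid	: 10;
		unsigned int evtid	: 7;
		unsigned int sum	: 1;
		unsigned int domid	: 14;
	} u;
};

/* Holds on the common GCC/Clang ABIs: no spare bits to widen domid. */
static_assert(sizeof(((union mon_data_bits *)0)->u) == sizeof(uint32_t),
	      "bitfields fill exactly 32 bits");

/* Layout after this patch: every field gets a full unsigned int. */
struct mon_data {
	unsigned int rid;
	unsigned int evtid;
	unsigned int sum;
	unsigned int domid;
};

/* Hypothetical helper: round-trip a cache-id through the old bitfield. */
static uint32_t domid_via_bitfield(uint32_t cache_id)
{
	union mon_data_bits md = { 0 };

	md.u.domid = cache_id;	/* only the low 14 bits are kept */
	return md.u.domid;
}

/* Hypothetical helper: round-trip a cache-id through the new struct. */
static uint32_t domid_via_struct(uint32_t cache_id)
{
	struct mon_data md = { 0 };

	md.domid = cache_id;
	return md.domid;
}
```

A sparse id such as 0x12345 comes back as 0x2345 from the bitfield version and
unchanged from the struct version.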
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++++++-----
arch/x86/kernel/cpu/resctrl/internal.h | 37 +++++++++++------------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 28 ++++++++++++-----
3 files changed, 50 insertions(+), 34 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 032a585293af..0b475e274483 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -645,7 +645,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
u32 resid, evtid, domid;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
- union mon_data_bits md;
+ struct mon_data *md;
int ret = 0;
rdtgrp = rdtgroup_kn_lock_live(of->kn);
@@ -654,17 +654,22 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
goto out;
}
- md.priv = of->kn->priv;
- resid = md.u.rid;
- domid = md.u.domid;
- evtid = md.u.evtid;
+ md = of->kn->priv;
+ if (!md) {
+ ret = -EIO;
+ goto out;
+ }
+
+ resid = md->rid;
+ domid = md->domid;
+ evtid = md->evtid;
r = resctrl_arch_get_resource(resid);
- if (md.u.sum) {
+ if (md->sum) {
/*
* This file requires summing across all domains that share
* the L3 cache id that was provided in the "domid" field of the
- * mon_data_bits union. Search all domains in the resource for
+ * struct mon_data. Search all domains in the resource for
* one that matches this cache id.
*/
list_for_each_entry(d, &r->mon_domains, hdr.list) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 32ed9aeffb90..16c1a391d012 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -103,27 +103,24 @@ struct mon_evt {
};
/**
- * union mon_data_bits - Monitoring details for each event file.
- * @priv: Used to store monitoring event data in @u
- * as kernfs private data.
- * @u.rid: Resource id associated with the event file.
- * @u.evtid: Event id associated with the event file.
- * @u.sum: Set when event must be summed across multiple
- * domains.
- * @u.domid: When @u.sum is zero this is the domain to which
- * the event file belongs. When @sum is one this
- * is the id of the L3 cache that all domains to be
- * summed share.
- * @u: Name of the bit fields struct.
+ * struct mon_data - Monitoring details for each event file.
+ * @rid: Resource id associated with the event file.
+ * @evtid: Event id associated with the event file.
+ * @sum: Set when event must be summed across multiple
+ * domains.
+ * @domid: When @sum is zero this is the domain to which
+ * the event file belongs. When @sum is one this
+ * is the id of the L3 cache that all domains to be
+ * summed share.
+ *
+ * Stored in the kernfs kn->priv field, readers and writers must hold
+ * rdtgroup_mutex.
*/
-union mon_data_bits {
- void *priv;
- struct {
- unsigned int rid : 10;
- enum resctrl_event_id evtid : 7;
- unsigned int sum : 1;
- unsigned int domid : 14;
- } u;
+struct mon_data {
+ unsigned int rid;
+ enum resctrl_event_id evtid;
+ unsigned int sum;
+ unsigned int domid;
};
/**
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6832ae603db3..abebe01447ba 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3113,11 +3113,19 @@ static struct file_system_type rdt_fs_type = {
};
static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
- void *priv)
+ struct mon_data *_priv)
{
struct kernfs_node *kn;
+ struct mon_data *priv;
int ret = 0;
+ lockdep_assert_held(&rdtgroup_mutex);
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+ memcpy(priv, _priv, sizeof(*priv));
+
kn = __kernfs_create_file(parent_kn, name, 0444,
GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
&kf_mondata_ops, priv, NULL, NULL);
@@ -3137,9 +3145,15 @@ static void mon_rmdir_one_subdir(struct kernfs_node *pkn, char *name, char *subn
{
struct kernfs_node *kn;
+ lockdep_assert_held(&rdtgroup_mutex);
+
kn = kernfs_find_and_get(pkn, name);
if (!kn)
return;
+
+ kfree(kn->priv);
+ kn->priv = NULL;
+
kernfs_put(kn);
if (kn->dir.subdirs <= 1)
@@ -3180,19 +3194,19 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
bool do_sum)
{
struct rmid_read rr = {0};
- union mon_data_bits priv;
+ struct mon_data priv;
struct mon_evt *mevt;
int ret;
if (WARN_ON(list_empty(&r->evt_list)))
return -EPERM;
- priv.u.rid = r->rid;
- priv.u.domid = do_sum ? d->ci->id : d->hdr.id;
- priv.u.sum = do_sum;
+ priv.rid = r->rid;
+ priv.domid = do_sum ? d->ci->id : d->hdr.id;
+ priv.sum = do_sum;
list_for_each_entry(mevt, &r->evt_list, list) {
- priv.u.evtid = mevt->evtid;
- ret = mon_addfile(kn, mevt->name, priv.priv);
+ priv.evtid = mevt->evtid;
+ ret = mon_addfile(kn, mevt->name, &priv);
if (ret)
return ret;
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits
2025-02-07 18:18 ` [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits James Morse
@ 2025-02-20 5:40 ` Reinette Chatre
2025-02-28 19:53 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 5:40 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> MPAM platforms retrieve the cache-id property from the ACPI PPTT table.
> The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes
> the domain-id, and is packed into the mon_data_bits union bitfield.
> The width of cache-id in this field is 14 bits.
>
> Expanding the union would break 32bit x86 platforms as this union is
> stored as the kernfs kn->priv pointer. This saved allocating memory
> for the priv data storage.
>
> The firmware on MPAM platforms has used the PPTT cache-id field to
> expose the interconnect's id for the cache, which is sparse and uses
> more than 14 bits. Use of this id is to enable PCIe direct cache
> injection hints. Using this feature with VFIO means the value provided
> by the ACPI table should be exposed to user-space.
>
> To support cache-id values greater than 14 bits, convert the
> mon_data_bits union to a structure. This is allocated when the kernfs
> file is created, and free'd when the monitor directory is rmdir'd.
> Readers and writers must hold the rdtgroup_mutex, and readers should
> check for a NULL pointer to protect against an open file preventing
> the kernfs file from being free'd immediately after the rmdir call.
The last sentence is difficult to parse and took me many reads. I see
two major parts to this statement and if I understand correctly the current
implementation combined with this patch does not support either.
(a) "checking for a NULL pointer from readers"
The reader is rdtgroup_mondata_show() and it starts by calling:
rdtgrp = rdtgroup_kn_lock_live(of->kn);
As I understand, on return of rdtgroup_kn_lock_live() the kernfs node
"of->kn" may no longer exist. This seems to be an issue with current code
also.
Considering this, it seems to me that checking if of->kn->priv is NULL
may be futile if of->kn may no longer exist.
I think this also needs a reference to the data needed by the file or
the data needs to be stashed away before the call to
kernfs_break_active_protection().
(b) "...being free'd immediately after the rmdir call"
I believe this refers to expectation that one task may have the file open
while another removes the resource group directory ("rmdir") with the
assumption that the associated struct mon_data is removed during handling
of rmdir. In this implementation the monitoring data file's struct mon_data
is only removed when a monitoring domain goes offline. That is, when the
resource group remains intact while the monitoring data files associated with
one domain are removed (for example when all CPUs associated with that domain
go offline). The "rmdir" to remove a resource group does not call this code
(mon_rmdir_one_subdir()), nor does the cleanup of the default resource group's
"kn_mondata".
I am trying to get a handle on the different lifetimes and if I understand
correctly this implementation does not attempt to keep the struct mon_data
accessible as long as the file is open. I do not think I've discovered all the
implications of this yet.
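
The "stash the data away first" option from (a) can be modelled in user
space roughly as below. This is only a sketch of the idea, not kernel code:
struct fake_node stands in for the kernfs node, and stash_mon_data() and
fake_remove() are hypothetical names. The point is that the reader takes a
copy by value while the node is still guaranteed to exist, and never goes
back to the pointer afterwards.

```c
#include <assert.h>
#include <stdlib.h>

struct mon_data {
	unsigned int rid;
	unsigned int evtid;
	unsigned int sum;
	unsigned int domid;
};

/* Stand-in for a kernfs node. NOT the kernel's struct kernfs_node. */
struct fake_node {
	struct mon_data *priv;
};

/*
 * Copy the private data while the node is still known to be alive.
 * Returns 0 on success, -1 if there is nothing left to copy.
 */
static int stash_mon_data(const struct fake_node *kn, struct mon_data *out)
{
	if (!kn || !kn->priv)
		return -1;

	*out = *kn->priv;	/* by value: no later dereference needed */
	return 0;
}

/* Models the removal path that frees the private data. */
static void fake_remove(struct fake_node *kn)
{
	free(kn->priv);
	kn->priv = NULL;
}
```

After the copy, the removal path can free the original at any time without
affecting the reader.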
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Previously the MPAM tree repainted the cache-id to compact them,
> arguing there was no other user. With VFIO use of this PCIe feature,
> this is no longer an option:
> http://inbox.dpdk.org/dev/PH7PR12MB8596BF09963460CEAE17582E82522@PH7PR12MB8596.namprd12.prod.outlook.com/
> ---
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++++++-----
> arch/x86/kernel/cpu/resctrl/internal.h | 37 +++++++++++------------
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 28 ++++++++++++-----
> 3 files changed, 50 insertions(+), 34 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 032a585293af..0b475e274483 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -645,7 +645,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> u32 resid, evtid, domid;
> struct rdtgroup *rdtgrp;
> struct rdt_resource *r;
> - union mon_data_bits md;
> + struct mon_data *md;
> int ret = 0;
>
> rdtgrp = rdtgroup_kn_lock_live(of->kn);
> @@ -654,17 +654,22 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> goto out;
> }
>
> - md.priv = of->kn->priv;
> - resid = md.u.rid;
> - domid = md.u.domid;
> - evtid = md.u.evtid;
> + md = of->kn->priv;
> + if (!md) {
> + ret = -EIO;
> + goto out;
> + }
> +
> + resid = md->rid;
> + domid = md->domid;
> + evtid = md->evtid;
> r = resctrl_arch_get_resource(resid);
>
> - if (md.u.sum) {
> + if (md->sum) {
> /*
> * This file requires summing across all domains that share
> * the L3 cache id that was provided in the "domid" field of the
> - * mon_data_bits union. Search all domains in the resource for
> + * struct mon_data. Search all domains in the resource for
> * one that matches this cache id.
> */
> list_for_each_entry(d, &r->mon_domains, hdr.list) {
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 32ed9aeffb90..16c1a391d012 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -103,27 +103,24 @@ struct mon_evt {
> };
>
> /**
> - * union mon_data_bits - Monitoring details for each event file.
> - * @priv: Used to store monitoring event data in @u
> - * as kernfs private data.
> - * @u.rid: Resource id associated with the event file.
> - * @u.evtid: Event id associated with the event file.
> - * @u.sum: Set when event must be summed across multiple
> - * domains.
> - * @u.domid: When @u.sum is zero this is the domain to which
> - * the event file belongs. When @sum is one this
> - * is the id of the L3 cache that all domains to be
> - * summed share.
> - * @u: Name of the bit fields struct.
> + * struct mon_data - Monitoring details for each event file.
> + * @rid: Resource id associated with the event file.
> + * @evtid: Event id associated with the event file.
> + * @sum: Set when event must be summed across multiple
> + * domains.
> + * @domid: When @sum is zero this is the domain to which
> + * the event file belongs. When @sum is one this
> + * is the id of the L3 cache that all domains to be
> + * summed share.
> + *
> + * Stored in the kernfs kn->priv field, readers and writers must hold
> + * rdtgroup_mutex.
> */
> -union mon_data_bits {
> - void *priv;
> - struct {
> - unsigned int rid : 10;
> - enum resctrl_event_id evtid : 7;
> - unsigned int sum : 1;
> - unsigned int domid : 14;
> - } u;
> +struct mon_data {
> + unsigned int rid;
> + enum resctrl_event_id evtid;
> + unsigned int sum;
> + unsigned int domid;
> };
>
> /**
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 6832ae603db3..abebe01447ba 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -3113,11 +3113,19 @@ static struct file_system_type rdt_fs_type = {
> };
>
> static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
> - void *priv)
> + struct mon_data *_priv)
> {
> struct kernfs_node *kn;
> + struct mon_data *priv;
> int ret = 0;
>
> + lockdep_assert_held(&rdtgroup_mutex);
> +
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> + memcpy(priv, _priv, sizeof(*priv));
> +
> kn = __kernfs_create_file(parent_kn, name, 0444,
> GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
> &kf_mondata_ops, priv, NULL, NULL);
> @@ -3137,9 +3145,15 @@ static void mon_rmdir_one_subdir(struct kernfs_node *pkn, char *name, char *subn
> {
> struct kernfs_node *kn;
>
> + lockdep_assert_held(&rdtgroup_mutex);
> +
> kn = kernfs_find_and_get(pkn, name);
> if (!kn)
> return;
> +
> + kfree(kn->priv);
> + kn->priv = NULL;
> +
> kernfs_put(kn);
>
> if (kn->dir.subdirs <= 1)
> @@ -3180,19 +3194,19 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
> bool do_sum)
> {
> struct rmid_read rr = {0};
> - union mon_data_bits priv;
> + struct mon_data priv;
> struct mon_evt *mevt;
> int ret;
>
> if (WARN_ON(list_empty(&r->evt_list)))
> return -EPERM;
>
> - priv.u.rid = r->rid;
> - priv.u.domid = do_sum ? d->ci->id : d->hdr.id;
> - priv.u.sum = do_sum;
> + priv.rid = r->rid;
> + priv.domid = do_sum ? d->ci->id : d->hdr.id;
> + priv.sum = do_sum;
> list_for_each_entry(mevt, &r->evt_list, list) {
> - priv.u.evtid = mevt->evtid;
> - ret = mon_addfile(kn, mevt->name, priv.priv);
> + priv.evtid = mevt->evtid;
> + ret = mon_addfile(kn, mevt->name, &priv);
> if (ret)
> return ret;
>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits
2025-02-20 5:40 ` Reinette Chatre
@ 2025-02-28 19:53 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi Reinette,
On 20/02/2025 05:40, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> MPAM platforms retrieve the cache-id property from the ACPI PPTT table.
>> The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes
>> the domain-id, and is packed into the mon_data_bits union bitfield.
>> The width of cache-id in this field is 14 bits.
>>
>> Expanding the union would break 32bit x86 platforms as this union is
>> stored as the kernfs kn->priv pointer. This saved allocating memory
>> for the priv data storage.
>>
>> The firmware on MPAM platforms has used the PPTT cache-id field to
>> expose the interconnect's id for the cache, which is sparse and uses
>> more than 14 bits. Use of this id is to enable PCIe direct cache
>> injection hints. Using this feature with VFIO means the value provided
>> by the ACPI table should be exposed to user-space.
>>
>> To support cache-id values greater than 14 bits, convert the
>> mon_data_bits union to a structure. This is allocated when the kernfs
>> file is created, and free'd when the monitor directory is rmdir'd.
>> Readers and writers must hold the rdtgroup_mutex, and readers should
>> check for a NULL pointer to protect against an open file preventing
>> the kernfs file from being free'd immediately after the rmdir call.
> The last sentence is difficult to parse and took me many reads. I see
> two major parts to this statement and if I understand correctly the current
> implementation combined with this patch does not support either.
> (a) "checking for a NULL pointer from readers"
> The reader is rdtgroup_mondata_show() and it starts by calling:
> rdtgrp = rdtgroup_kn_lock_live(of->kn);
> As I understand, on return of rdtgroup_kn_lock_live() the kernfs node
> "of->kn" may no longer exist. This seems to be an issue with current code
> also.
> Considering this, it seems to me that checking if of->kn->priv is NULL
> may be futile if of->kn may no longer exist.
Certainly true.
Because the lifetime is different to the existing pointer-abuse version, I just added the
checks to be on the safe side.
I'll rip this out.
> I think this also needs a reference to the data needed by the file or
> the data needs to be stashed away before the call to
> kernfs_break_active_protection().
I've tried to hit this problem, and been unable. I'm happy to write it off as theoretical.
In particular:
* rmdir a control group while holding the mbm_local_bytes file open for reading. Any read
after the parent control group has been destroyed gets -ENODEV, even though
/proc/<pid>/fd shows the fd as open for reading. (The kernel in question had lockdep and
kasan enabled.)
* take all the CPUs in a domain offline while holding the mbm_local_bytes file open for
reading. Again, read attempts get -ENODEV.
> (b) "...being free'd immediately after the rmdir call"
> I believe this refers to expectation that one task may have the file open
> while another removes the resource group directory ("rmdir") with the
> assumption that the associated struct mon_data is removed during handling
> of rmdir.
This is what I was worried about - and it seemed worth chucking in a NULL check just in
case. Trying a bit harder to hit it - it now seems theoretical.
> In this implementation the monitoring data file's struct mon_data
> is only removed when a monitoring domain goes offline.
> That is, when the
> resource group remains intact while the monitoring data files associated with
> one domain are removed (for example when all CPUs associated with that domain
> go offline). The "rmdir" to remove a resource group does not call this code
> (mon_rmdir_one_subdir()), nor does the cleanup of the default resource group's
> "kn_mondata".
Huh, it's the path via user-space calling rmdir() that I was worried about. I hadn't
spotted that there are two of these and they aren't joined up!
This would leak the priv pointer when the user-space path via rmdir() just leaves the
cleanup to kernfs.
Fixing this produces even more spaghetti as domain-offline manipulates one domain in all
rdtgroups, whereas rmdir manipulates all domains in one rdtgroup. It's going to be noisy to
merge these two paths.
A simpler approach is to use the event kn->priv pointers in the default control group as
the canonical copy, which also saves memory. For mbm_total in a domain, every control and
monitor group has the same values in struct mon_data_bits - the RMID is found by walking
up the tree to find the struct rdtgroup.
As user-space can't rmdir the default control group, we only need to free it for
domain-offline, when we know all the files for that domain are going to be removed - which
we can rely on to avoid doing it in a particular order.
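
A rough user-space model of the "one canonical copy" idea, as a sketch only.
It swaps in a plain linked list as the shared store rather than the default
control group's kn->priv pointers, and mon_data_get() and
mon_data_free_domain() are hypothetical names. What it tries to show is that
every group asking for the same (rid, domid, evtid, sum) gets the same
object, and that the objects are only freed when the domain goes away.

```c
#include <assert.h>
#include <stdlib.h>

struct mon_data {
	struct mon_data *next;
	unsigned int rid;
	unsigned int evtid;
	unsigned int sum;
	unsigned int domid;
};

/* Shared store; in the kernel this would be protected by rdtgroup_mutex. */
static struct mon_data *mon_data_list;

/* Return the existing entry for this tuple, or allocate one on first use. */
static struct mon_data *mon_data_get(unsigned int rid, unsigned int domid,
				     unsigned int evtid, unsigned int sum)
{
	struct mon_data *md;

	for (md = mon_data_list; md; md = md->next) {
		if (md->rid == rid && md->domid == domid &&
		    md->evtid == evtid && md->sum == sum)
			return md;
	}

	md = calloc(1, sizeof(*md));
	if (!md)
		return NULL;

	md->rid = rid;
	md->domid = domid;
	md->evtid = evtid;
	md->sum = sum;
	md->next = mon_data_list;
	mon_data_list = md;
	return md;
}

/* Domain offline: free every entry of that domain, in whatever order. */
static unsigned int mon_data_free_domain(unsigned int rid, unsigned int domid)
{
	struct mon_data **pp = &mon_data_list;
	unsigned int freed = 0;

	while (*pp) {
		struct mon_data *md = *pp;

		if (md->rid == rid && md->domid == domid) {
			*pp = md->next;
			free(md);
			freed++;
		} else {
			pp = &md->next;
		}
	}
	return freed;
}
```

With this shape the rmdir path has nothing to free, since a group never owns
the data its files point at.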
> I am trying to get a handle on the different lifetimes and if I understand
> correctly this implementation does not attempt to keep the struct mon_data
> accessible as long as the file is open.
No, but I think that concern is theoretical...
> I do not think I've discovered all the implications of this yet.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 38/42] x86/resctrl: Remove a newline to avoid confusing the code move script
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (36 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 37/42] x86/restrl: Expand the width of dom_id by replacing mon_data_bits James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 5:42 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 39/42] x86/resctrl: Split trace.h James Morse
` (5 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
The resctrl filesystem code will shortly be moved to /fs/. This involves
splitting all the existing files, with some functions remaining under
arch/x86, and others moving to fs/resctrl.
To make this reproducible, a python script does the heavy lif^W
copy-and-paste. This involves some clunky parsing of C code.
The parser gets confused by the newline after this #ifdef.
Just remove it.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index abebe01447ba..ebf17bcbd095 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -892,7 +892,6 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
}
#ifdef CONFIG_PROC_CPU_RESCTRL
-
/*
* A task can only be part of one resctrl control group and of one monitor
* group which is associated to that control group.
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 38/42] x86/resctrl: Remove a newline to avoid confusing the code move script
2025-02-07 18:18 ` [PATCH v6 38/42] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
@ 2025-02-20 5:42 ` Reinette Chatre
0 siblings, 0 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 5:42 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> The resctrl filesystem code will shortly be moved to /fs/. This involves
> splitting all the existing files, with some functions remaining under
> arch/x86, and others moving to fs/resctrl.
>
> To make this reproducible, a python script does the heavy lif^W
> copy-and-paste. This involves some clunky parsing of C code.
>
> The parser gets confused by the newline after this #ifdef.
> Just remove it.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (37 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 38/42] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 5:45 ` Reinette Chatre
2025-02-27 23:16 ` Fenghua Yu
2025-02-07 18:18 ` [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code James Morse
` (4 subsequent siblings)
43 siblings, 2 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
trace.h contains all the tracepoints. After the move to /fs/resctrl, some
of these will be left behind. All the pseudo_lock tracepoints remain part
of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
Split trace.h so that each C file includes a different trace header file.
This means the trace header files are not modified when they are moved.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
arch/x86/kernel/cpu/resctrl/Makefile | 3 ++
arch/x86/kernel/cpu/resctrl/monitor.c | 4 ++-
arch/x86/kernel/cpu/resctrl/monitor_trace.h | 31 +++++++++++++++++++
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +-
.../resctrl/{trace.h => pseudo_lock_trace.h} | 24 +++-----------
5 files changed, 42 insertions(+), 22 deletions(-)
create mode 100644 arch/x86/kernel/cpu/resctrl/monitor_trace.h
rename arch/x86/kernel/cpu/resctrl/{trace.h => pseudo_lock_trace.h} (56%)
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index 0c13b0befd8a..909be78ec6da 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -2,4 +2,7 @@
obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
+
+# To allow define_trace.h's recursive include:
CFLAGS_pseudo_lock.o = -I$(src)
+CFLAGS_monitor.o = -I$(src)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index a9168913c153..6acfbd3ad007 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -26,7 +26,9 @@
#include <asm/resctrl.h>
#include "internal.h"
-#include "trace.h"
+
+#define CREATE_TRACE_POINTS
+#include "monitor_trace.h"
/**
* struct rmid_entry - dirty tracking for all RMID.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor_trace.h b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
new file mode 100644
index 000000000000..ade67daf42c2
--- /dev/null
+++ b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM resctrl
+
+#if !defined(_FS_RESCTRL_MONITOR_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _FS_RESCTRL_MONITOR_TRACE_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(mon_llc_occupancy_limbo,
+ TP_PROTO(u32 ctrl_hw_id, u32 mon_hw_id, int domain_id, u64 llc_occupancy_bytes),
+ TP_ARGS(ctrl_hw_id, mon_hw_id, domain_id, llc_occupancy_bytes),
+ TP_STRUCT__entry(__field(u32, ctrl_hw_id)
+ __field(u32, mon_hw_id)
+ __field(int, domain_id)
+ __field(u64, llc_occupancy_bytes)),
+ TP_fast_assign(__entry->ctrl_hw_id = ctrl_hw_id;
+ __entry->mon_hw_id = mon_hw_id;
+ __entry->domain_id = domain_id;
+ __entry->llc_occupancy_bytes = llc_occupancy_bytes;),
+ TP_printk("ctrl_hw_id=%u mon_hw_id=%u domain_id=%d llc_occupancy_bytes=%llu",
+ __entry->ctrl_hw_id, __entry->mon_hw_id, __entry->domain_id,
+ __entry->llc_occupancy_bytes)
+ );
+
+#endif /* _FS_RESCTRL_MONITOR_TRACE_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE monitor_trace
+#include <trace/define_trace.h>
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index e7f713eb4490..9eda0abbd29d 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -30,7 +30,7 @@
#include "internal.h"
#define CREATE_TRACE_POINTS
-#include "trace.h"
+#include "pseudo_lock_trace.h"
/*
* The bits needed to disable hardware prefetching varies based on the
diff --git a/arch/x86/kernel/cpu/resctrl/trace.h b/arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
similarity index 56%
rename from arch/x86/kernel/cpu/resctrl/trace.h
rename to arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
index 2a506316b303..5a0fae61d3ee 100644
--- a/arch/x86/kernel/cpu/resctrl/trace.h
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
@@ -2,8 +2,8 @@
#undef TRACE_SYSTEM
#define TRACE_SYSTEM resctrl
-#if !defined(_TRACE_RESCTRL_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _TRACE_RESCTRL_H
+#if !defined(_X86_RESCTRL_PSEUDO_LOCK_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _X86_RESCTRL_PSEUDO_LOCK_TRACE_H
#include <linux/tracepoint.h>
@@ -35,25 +35,9 @@ TRACE_EVENT(pseudo_lock_l3,
TP_printk("hits=%llu miss=%llu",
__entry->l3_hits, __entry->l3_miss));
-TRACE_EVENT(mon_llc_occupancy_limbo,
- TP_PROTO(u32 ctrl_hw_id, u32 mon_hw_id, int domain_id, u64 llc_occupancy_bytes),
- TP_ARGS(ctrl_hw_id, mon_hw_id, domain_id, llc_occupancy_bytes),
- TP_STRUCT__entry(__field(u32, ctrl_hw_id)
- __field(u32, mon_hw_id)
- __field(int, domain_id)
- __field(u64, llc_occupancy_bytes)),
- TP_fast_assign(__entry->ctrl_hw_id = ctrl_hw_id;
- __entry->mon_hw_id = mon_hw_id;
- __entry->domain_id = domain_id;
- __entry->llc_occupancy_bytes = llc_occupancy_bytes;),
- TP_printk("ctrl_hw_id=%u mon_hw_id=%u domain_id=%d llc_occupancy_bytes=%llu",
- __entry->ctrl_hw_id, __entry->mon_hw_id, __entry->domain_id,
- __entry->llc_occupancy_bytes)
- );
-
-#endif /* _TRACE_RESCTRL_H */
+#endif /* _X86_RESCTRL_PSEUDO_LOCK_TRACE_H */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace
+#define TRACE_INCLUDE_FILE pseudo_lock_trace
#include <trace/define_trace.h>
--
2.39.2
* Re: [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-07 18:18 ` [PATCH v6 39/42] x86/resctrl: Split trace.h James Morse
@ 2025-02-20 5:45 ` Reinette Chatre
2025-02-25 4:36 ` Fenghua Yu
2025-02-27 23:16 ` Fenghua Yu
1 sibling, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 5:45 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> trace.h contains all the tracepoints. After the move to /fs/resctrl, some
> of these will be left behind. All the pseudo_lock tracepoints remain part
> of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
>
> Split trace.h so that each C file includes a different trace header file.
> This means the trace header files are not modified when they are moved.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
I did not investigate if this originates here or after the code move but
when compiling the series (after running the file move script) with W=1
I see the following:
In file included from /home/reinette/dev/linux/include/trace/trace_events.h:27,
from /home/reinette/dev/linux/include/trace/define_trace.h:113,
from /home/reinette/dev/linux/arch/x86/kernel/cpu/resctrl/monitor_trace.h:17,
from /home/reinette/dev/linux/arch/x86/kernel/cpu/resctrl/monitor.c:32:
/home/reinette/dev/linux/include/trace/stages/init.h:2:23: warning: ‘str__resctrl__trace_system_name’ defined but not used [-Wunused-const-variable=]
2 | #define __app__(x, y) str__##x##y
| ^~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:3:21: note: in expansion of macro ‘__app__’
3 | #define __app(x, y) __app__(x, y)
| ^~~~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:5:29: note: in expansion of macro ‘__app’
5 | #define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
| ^~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:8:27: note: in expansion of macro ‘TRACE_SYSTEM_STRING’
8 | static const char TRACE_SYSTEM_STRING[] = \
| ^~~~~~~~~~~~~~~~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:11:1: note: in expansion of macro ‘TRACE_MAKE_SYSTEM_STR’
11 | TRACE_MAKE_SYSTEM_STR();
| ^~~~~~~~~~~~~~~~~~~~~
[SNIP]
In file included from /home/reinette/dev/linux/include/trace/trace_events.h:27,
from /home/reinette/dev/linux/include/trace/define_trace.h:113,
from /home/reinette/dev/linux/fs/resctrl/pseudo_lock_trace.h:17,
from /home/reinette/dev/linux/fs/resctrl/pseudo_lock.c:34:
/home/reinette/dev/linux/include/trace/stages/init.h:2:23: warning: ‘str__resctrl__trace_system_name’ defined but not used [-Wunused-const-variable=]
2 | #define __app__(x, y) str__##x##y
| ^~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:3:21: note: in expansion of macro ‘__app__’
3 | #define __app(x, y) __app__(x, y)
| ^~~~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:5:29: note: in expansion of macro ‘__app’
5 | #define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
| ^~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:8:27: note: in expansion of macro ‘TRACE_SYSTEM_STRING’
8 | static const char TRACE_SYSTEM_STRING[] = \
| ^~~~~~~~~~~~~~~~~~~
/home/reinette/dev/linux/include/trace/stages/init.h:11:1: note: in expansion of macro ‘TRACE_MAKE_SYSTEM_STR’
11 | TRACE_MAKE_SYSTEM_STR();
| ^~~~~~~~~~~~~~~~~~~~~
[SNIP]
Reinette
* Re: [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-20 5:45 ` Reinette Chatre
@ 2025-02-25 4:36 ` Fenghua Yu
2025-02-28 19:53 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Fenghua Yu @ 2025-02-25 4:36 UTC (permalink / raw)
To: Reinette Chatre, James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi, Reinette and James,
On 2/19/25 21:45, Reinette Chatre wrote:
> Hi James,
>
> On 2/7/25 10:18 AM, James Morse wrote:
>> trace.h contains all the tracepoints. After the move to /fs/resctrl, some
>> of these will be left behind. All the pseudo_lock tracepoints remain part
>> of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
>>
>> Split trace.h so that each C file includes a different trace header file.
>> This means the trace header files are not modified when they are moved.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>> ---
> I did not investigate if this originates here or after the code move but
> when compiling the series (after running the file move script) with W=1
The issues happen after running the move script.
It's because no trace event is defined in fs/resctrl/pseudo_lock_trace.h
or arch/x86/kernel/cpu/resctrl/monitor_trace.h.
One way to fix them is to add empty events in the trace files. But it
seems that may be difficult for the script, because it cannot handle
empty events easily.
Another way is to remove the two files and their inclusions in .c files.
Please see my comment and fix in patch #42.
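For illustration, the same class of warning can be reproduced outside the kernel. The sketch below is a user-space approximation, not the kernel's actual headers: the DEMO_* macros, struct demo_event and demo_event_system() are hypothetical stand-ins for the helpers in include/trace/stages/init.h and for a TRACE_EVENT() definition. It assumes that an event definition is the only thing that references the system-name string, which is always emitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's __stringify() and __app() helpers. */
#define DEMO_STRINGIFY_1(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_1(x)
#define DEMO_APP_(x, y) str__##x##y
#define DEMO_APP(x, y) DEMO_APP_(x, y)

#define DEMO_TRACE_SYSTEM resctrl
/* Expands to the identifier str__resctrl__trace_system_name. */
#define DEMO_SYSTEM_STRING DEMO_APP(DEMO_TRACE_SYSTEM, __trace_system_name)

/* Always emitted, whether or not the header defines any event. */
static const char DEMO_SYSTEM_STRING[] = DEMO_STRINGIFY(DEMO_TRACE_SYSTEM);

/* Simplified, hypothetical model of what an event definition records. */
struct demo_event {
	const char *system;	/* subsystem the event belongs to */
	const char *name;	/* event name */
};

#ifndef DEMO_NO_EVENTS
/* With at least one event, the system string has a user. */
static const struct demo_event demo_limbo_event = {
	.system = DEMO_SYSTEM_STRING,
	.name = "mon_llc_occupancy_limbo",
};

const char *demo_event_system(void)
{
	return demo_limbo_event.system;
}
#endif
```

Building this with -DDEMO_NO_EVENTS and -Wunused-const-variable should give the same kind of "defined but not used" diagnostic as quoted below, because nothing references the string once the last event is gone. That matches the two options above: either give the header an event, or stop including the event-less header.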
> I see the following:
>
> In file included from /home/reinette/dev/linux/include/trace/trace_events.h:27,
> from /home/reinette/dev/linux/include/trace/define_trace.h:113,
> from /home/reinette/dev/linux/arch/x86/kernel/cpu/resctrl/monitor_trace.h:17,
> from /home/reinette/dev/linux/arch/x86/kernel/cpu/resctrl/monitor.c:32:
> /home/reinette/dev/linux/include/trace/stages/init.h:2:23: warning: ‘str__resctrl__trace_system_name’ defined but not used [-Wunused-const-variable=]
> 2 | #define __app__(x, y) str__##x##y
> | ^~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:3:21: note: in expansion of macro ‘__app__’
> 3 | #define __app(x, y) __app__(x, y)
> | ^~~~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:5:29: note: in expansion of macro ‘__app’
> 5 | #define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
> | ^~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:8:27: note: in expansion of macro ‘TRACE_SYSTEM_STRING’
> 8 | static const char TRACE_SYSTEM_STRING[] = \
> | ^~~~~~~~~~~~~~~~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:11:1: note: in expansion of macro ‘TRACE_MAKE_SYSTEM_STR’
> 11 | TRACE_MAKE_SYSTEM_STR();
> | ^~~~~~~~~~~~~~~~~~~~~
> [SNIP]
> In file included from /home/reinette/dev/linux/include/trace/trace_events.h:27,
> from /home/reinette/dev/linux/include/trace/define_trace.h:113,
> from /home/reinette/dev/linux/fs/resctrl/pseudo_lock_trace.h:17,
> from /home/reinette/dev/linux/fs/resctrl/pseudo_lock.c:34:
> /home/reinette/dev/linux/include/trace/stages/init.h:2:23: warning: ‘str__resctrl__trace_system_name’ defined but not used [-Wunused-const-variable=]
> 2 | #define __app__(x, y) str__##x##y
> | ^~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:3:21: note: in expansion of macro ‘__app__’
> 3 | #define __app(x, y) __app__(x, y)
> | ^~~~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:5:29: note: in expansion of macro ‘__app’
> 5 | #define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
> | ^~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:8:27: note: in expansion of macro ‘TRACE_SYSTEM_STRING’
> 8 | static const char TRACE_SYSTEM_STRING[] = \
> | ^~~~~~~~~~~~~~~~~~~
> /home/reinette/dev/linux/include/trace/stages/init.h:11:1: note: in expansion of macro ‘TRACE_MAKE_SYSTEM_STR’
> 11 | TRACE_MAKE_SYSTEM_STR();
> | ^~~~~~~~~~~~~~~~~~~~~
>
> [SNIP]
>
> Reinette
Thanks.
-Fenghua
* Re: [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-25 4:36 ` Fenghua Yu
@ 2025-02-28 19:53 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: Fenghua Yu, Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Fenghua, Reinette,
On 25/02/2025 04:36, Fenghua Yu wrote:
> Hi, Reinette and James,
>
> On 2/19/25 21:45, Reinette Chatre wrote:
>> Hi James,
>>
>> On 2/7/25 10:18 AM, James Morse wrote:
>>> trace.h contains all the tracepoints. After the move to /fs/resctrl, some
>>> of these will be left behind. All the pseudo_lock tracepoints remain part
>>> of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
>>>
>>> Split trace.h so that each C file includes a different trace header file.
>>> This means the trace header files are not modified when they are moved.
>> I did not investigate if this originates here or after the code move but
>> when compiling the series (after running the file move script) with W=1
> The issues happen after running the move script.
>
> It's because no trace event is defined in fs/resctrl/pseudo_lock_trace.h or arch/x86/
> kernel/cpu/resctrl/monitor_trace.h.
>
> One way to fix them is to add empty events in the trace files. But seems that may cause
> the script difficulty because it cannot handle empty events easily.
>
> Another way is to remove the two files and their inclusions in .c files. Please see my
> comment and fix in patch #42.
Yup, I have a followup patch that does this:
https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.14-rc1&id=3d0430324a0c7e7ad765140f9e78a9a312a13573
I assumed this was harmless, but evidently it has some way of upsetting kbuild.
I'll post the version with the followup patches so they can be reviewed and squashed together.
Thanks,
James
* Re: [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-07 18:18 ` [PATCH v6 39/42] x86/resctrl: Split trace.h James Morse
2025-02-20 5:45 ` Reinette Chatre
@ 2025-02-27 23:16 ` Fenghua Yu
2025-02-28 19:53 ` James Morse
1 sibling, 1 reply; 135+ messages in thread
From: Fenghua Yu @ 2025-02-27 23:16 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
On 2/7/25 10:18, James Morse wrote:
> trace.h contains all the tracepoints. After the move to /fs/resctrl, some
> of these will be left behind. All the pseudo_lock tracepoints remain part
> of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
>
> Split trace.h so that each C file includes a different trace header file.
> This means the trace header files are not modified when they are moved.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
Since this patch itself doesn't cause the errors with W=1 and I don't
see any issue with it,
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Thanks.
-Fenghua
* Re: [PATCH v6 39/42] x86/resctrl: Split trace.h
2025-02-27 23:16 ` Fenghua Yu
@ 2025-02-28 19:53 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:53 UTC (permalink / raw)
To: Fenghua Yu, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Fenghua,
On 27/02/2025 23:16, Fenghua Yu wrote:
> On 2/7/25 10:18, James Morse wrote:
>> trace.h contains all the tracepoints. After the move to /fs/resctrl, some
>> of these will be left behind. All the pseudo_lock tracepoints remain part
>> of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
>>
>> Split trace.h so that each C file includes a different trace header file.
>> This means the trace header files are not modified when they are moved.
> Since this patch itself doesn't cause the errors when W=1 and doesn't appear any issue to me,
>
> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Thanks!
James
* [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (38 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 39/42] x86/resctrl: Split trace.h James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 5:54 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
` (3 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
for the common parts of the resctrl interface and make X86_CPU_RESCTRL
select this.
Adding an include of asm/resctrl.h to linux/resctrl.h allows the
/fs/resctrl files to switch over to using this header instead.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Removed double include.
* Removed duplicate pseudo lock Kconfig define
* Grammar fix and closid capitalisation change.
Changes since v4:
* Tweaking of the commit message.
Changes since v3:
* Reworded 'if unsure say N' from the Kconfig text, the user doesn't have
the choice anyway at this point.
* Added PWD to monitor.o's CFLAGS for the ftrace rube-goldberg build machine.
* Added split trace files.
Changes since v2:
* Dropped KERNFS dependency from arch side Kconfig.
* Added empty trace.h file.
* Merged asm->linux includes from Dave's patch to decouple those
patches from this series.
Changes since v1:
* Rename new file psuedo_lock.c to pseudo_lock.c, to match the name
of the original file (and to be less surprising).
* [Whitespace only] Under RESCTRL_FS in fs/resctrl/Kconfig, delete
alignment space in orphaned select ... if (which has nothing to line
up with any more).
* [Whitespace only] Reflow and re-tab Kconfig additions.
---
MAINTAINERS | 1 +
arch/Kconfig | 8 +++++
arch/x86/Kconfig | 11 ++-----
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
fs/Kconfig | 1 +
fs/Makefile | 1 +
fs/resctrl/Kconfig | 37 +++++++++++++++++++++++
fs/resctrl/Makefile | 6 ++++
fs/resctrl/ctrlmondata.c | 0
fs/resctrl/internal.h | 0
fs/resctrl/monitor.c | 0
fs/resctrl/monitor_trace.h | 0
fs/resctrl/pseudo_lock.c | 0
fs/resctrl/pseudo_lock_trace.h | 0
fs/resctrl/rdtgroup.c | 0
include/linux/resctrl.h | 4 +++
19 files changed, 64 insertions(+), 13 deletions(-)
create mode 100644 fs/resctrl/Kconfig
create mode 100644 fs/resctrl/Makefile
create mode 100644 fs/resctrl/ctrlmondata.c
create mode 100644 fs/resctrl/internal.h
create mode 100644 fs/resctrl/monitor.c
create mode 100644 fs/resctrl/monitor_trace.h
create mode 100644 fs/resctrl/pseudo_lock.c
create mode 100644 fs/resctrl/pseudo_lock_trace.h
create mode 100644 fs/resctrl/rdtgroup.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 314b9a2ebe20..437d6e05f286 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19836,6 +19836,7 @@ S: Supported
F: Documentation/arch/x86/resctrl*
F: arch/x86/include/asm/resctrl.h
F: arch/x86/kernel/cpu/resctrl/
+F: fs/resctrl/
F: include/linux/resctrl*.h
F: tools/testing/selftests/resctrl/
diff --git a/arch/Kconfig b/arch/Kconfig
index b8a4ff365582..2778a7859c11 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1518,6 +1518,14 @@ config STRICT_MODULE_RWX
config ARCH_HAS_PHYS_TO_DMA
bool
+config ARCH_HAS_CPU_RESCTRL
+ bool
+ help
+ An architecture selects this option to indicate that the necessary
+ hooks are provided to support the common memory system usage
+ monitoring and control interfaces provided by the 'resctrl'
+ filesystem (see RESCTRL_FS).
+
config HAVE_ARCH_COMPILER_H
bool
help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 41dda57c4953..d88eb8c3c838 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -503,8 +503,9 @@ config X86_MPPARSE
config X86_CPU_RESCTRL
bool "x86 CPU resource control support"
depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
- select KERNFS
- select PROC_CPU_RESCTRL if PROC_FS
+ depends on MISC_FILESYSTEMS
+ select ARCH_HAS_CPU_RESCTRL
+ select RESCTRL_FS
select RESCTRL_FS_PSEUDO_LOCK
help
Enable x86 CPU resource control support.
@@ -522,12 +523,6 @@ config X86_CPU_RESCTRL
Say N if unsure.
-config RESCTRL_FS_PSEUDO_LOCK
- bool
- help
- Software mechanism to pin data in a cache portion using
- micro-architecture specific knowledge.
-
config X86_FRED
bool "Flexible Return and Event Delivery"
depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 16c1a391d012..ee50b7375717 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -9,8 +9,6 @@
#include <linux/jump_label.h>
#include <linux/tick.h>
-#include <asm/resctrl.h>
-
#define L3_QOS_CDP_ENABLE 0x01ULL
#define L2_QOS_CDP_ENABLE 0x01ULL
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 6acfbd3ad007..8e3fbfa10f52 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -19,11 +19,11 @@
#include <linux/cpu.h>
#include <linux/module.h>
+#include <linux/resctrl.h>
#include <linux/sizes.h>
#include <linux/slab.h>
#include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
#include "internal.h"
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 9eda0abbd29d..56b7faceebd4 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -18,12 +18,12 @@
#include <linux/mman.h>
#include <linux/perf_event.h>
#include <linux/pm_qos.h>
+#include <linux/resctrl.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <asm/cacheflush.h>
#include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
#include <asm/perf_event.h>
#include "../../events/perf_event.h" /* For X86_CONFIG() */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ebf17bcbd095..eaf933b823fa 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -18,6 +18,7 @@
#include <linux/fs_parser.h>
#include <linux/sysfs.h>
#include <linux/kernfs.h>
+#include <linux/resctrl.h>
#include <linux/seq_buf.h>
#include <linux/seq_file.h>
#include <linux/sched/signal.h>
@@ -28,7 +29,6 @@
#include <uapi/linux/magic.h>
-#include <asm/resctrl.h>
#include "internal.h"
DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
diff --git a/fs/Kconfig b/fs/Kconfig
index 64d420e3c475..709e4d6656e2 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -334,6 +334,7 @@ source "fs/omfs/Kconfig"
source "fs/hpfs/Kconfig"
source "fs/qnx4/Kconfig"
source "fs/qnx6/Kconfig"
+source "fs/resctrl/Kconfig"
source "fs/romfs/Kconfig"
source "fs/pstore/Kconfig"
source "fs/sysv/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 15df0a923d3a..73512f13e969 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -129,3 +129,4 @@ obj-$(CONFIG_EROFS_FS) += erofs/
obj-$(CONFIG_VBOXSF_FS) += vboxsf/
obj-$(CONFIG_ZONEFS_FS) += zonefs/
obj-$(CONFIG_BPF_LSM) += bpf_fs_kfuncs.o
+obj-$(CONFIG_RESCTRL_FS) += resctrl/
diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
new file mode 100644
index 000000000000..229ca71a8258
--- /dev/null
+++ b/fs/resctrl/Kconfig
@@ -0,0 +1,37 @@
+config RESCTRL_FS
+ bool "CPU Resource Control Filesystem (resctrl)"
+ depends on ARCH_HAS_CPU_RESCTRL
+ select KERNFS
+ select PROC_CPU_RESCTRL if PROC_FS
+ help
+ Some architectures provide hardware facilities to group tasks and
+ monitor and control their usage of memory system resources such as
+ caches and memory bandwidth. Examples of such facilities include
+ Intel's Resource Director Technology (Intel(R) RDT) and AMD's
+ Platform Quality of Service (AMD QoS).
+
+ If your system has the necessary support and you want to be able to
+ assign tasks to groups and manipulate the associated resource
+ monitors and controls from userspace, say Y here to get a mountable
+ 'resctrl' filesystem that lets you do just that.
+
+ If nothing mounts or prods the 'resctrl' filesystem, resource
+ controls and monitors are left in a quiescent, permissive state.
+
+ On architectures where this can be disabled independently, it is
+ safe to say N.
+
+ See <file:Documentation/arch/x86/resctrl.rst> for more information.
+
+config RESCTRL_FS_PSEUDO_LOCK
+ bool
+ help
+ Software mechanism to pin data in a cache portion using
+ micro-architecture specific knowledge.
+
+config RESCTRL_RMID_DEPENDS_ON_CLOSID
+ bool
+ help
+ Enabled by the architecture when the RMID values depend on the CLOSID.
+ This causes the CLOSID allocator to search for CLOSID with clean
+ RMID.
diff --git a/fs/resctrl/Makefile b/fs/resctrl/Makefile
new file mode 100644
index 000000000000..e67f34d2236a
--- /dev/null
+++ b/fs/resctrl/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_RESCTRL_FS) += rdtgroup.o ctrlmondata.o monitor.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
+
+# To allow define_trace.h's recursive include:
+CFLAGS_monitor.o = -I$(src)
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/monitor_trace.h b/fs/resctrl/monitor_trace.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/pseudo_lock_trace.h b/fs/resctrl/pseudo_lock_trace.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a8ff2cdba2c6..ef0802cd5c45 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -9,6 +9,10 @@
#include <linux/pid.h>
#include <linux/resctrl_types.h>
+#ifdef CONFIG_ARCH_HAS_CPU_RESCTRL
+#include <asm/resctrl.h>
+#endif
+
/* CLOSID, RMID value used by the default control group */
#define RESCTRL_RESERVED_CLOSID 0
#define RESCTRL_RESERVED_RMID 0
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code
2025-02-07 18:18 ` [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2025-02-20 5:54 ` Reinette Chatre
2025-02-28 19:54 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 5:54 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
> select this.
>
> Adding an include of asm/resctrl.h to linux/resctrl.h allows the
> /fs/resctrl files to switch over to using this header instead.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
...
> diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
> new file mode 100644
> index 000000000000..229ca71a8258
> --- /dev/null
> +++ b/fs/resctrl/Kconfig
> @@ -0,0 +1,37 @@
> +config RESCTRL_FS
> + bool "CPU Resource Control Filesystem (resctrl)"
> + depends on ARCH_HAS_CPU_RESCTRL
> + select KERNFS
> + select PROC_CPU_RESCTRL if PROC_FS
> + help
> + Some architectures provide hardware facilities to group tasks and
> + monitor and control their usage of memory system resources such as
> + caches and memory bandwidth. Examples of such facilities include
> + Intel's Resource Director Technology (Intel(R) RDT) and AMD's
> + Platform Quality of Service (AMD QoS).
> +
> + If your system has the necessary support and you want to be able to
> + assign tasks to groups and manipulate the associated resource
> + monitors and controls from userspace, say Y here to get a mountable
> + 'resctrl' filesystem that lets you do just that.
> +
> + If nothing mounts or prods the 'resctrl' filesystem, resource
> + controls and monitors are left in a quiescent, permissive state.
> +
> + On architectures where this can be disabled independently, it is
> + safe to say N.
> +
> + See <file:Documentation/arch/x86/resctrl.rst> for more information.
> +
> +config RESCTRL_FS_PSEUDO_LOCK
> + bool
> + help
> + Software mechanism to pin data in a cache portion using
> + micro-architecture specific knowledge.
> +
> +config RESCTRL_RMID_DEPENDS_ON_CLOSID
> + bool
> + help
> + Enabled by the architecture when the RMID values depend on the CLOSID.
> + This causes the CLOSID allocator to search for CLOSID with clean
> + RMID.
With RESCTRL_FS_PSEUDO_LOCK and RESCTRL_RMID_DEPENDS_ON_CLOSID appearing at
same level as RESCTRL_FS all three configs "depends on MISC_FILESYSTEMS".
Should RESCTRL_FS_PSEUDO_LOCK and RESCTRL_RMID_DEPENDS_ON_CLOSID
"depends on RESCTRL_FS" instead?
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code
2025-02-20 5:54 ` Reinette Chatre
@ 2025-02-28 19:54 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:54 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 05:54, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
>> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
>> select this.
>>
>> Adding an include of asm/resctrl.h to linux/resctrl.h allows the
>> /fs/resctrl files to switch over to using this header instead.
>> diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
>> new file mode 100644
>> index 000000000000..229ca71a8258
>> --- /dev/null
>> +++ b/fs/resctrl/Kconfig
>> @@ -0,0 +1,37 @@
>> +config RESCTRL_FS
>> + bool "CPU Resource Control Filesystem (resctrl)"
>> + depends on ARCH_HAS_CPU_RESCTRL
>> + select KERNFS
>> + select PROC_CPU_RESCTRL if PROC_FS
>> + help
>> + Some architectures provide hardware facilities to group tasks and
>> + monitor and control their usage of memory system resources such as
>> + caches and memory bandwidth. Examples of such facilities include
>> + Intel's Resource Director Technology (Intel(R) RDT) and AMD's
>> + Platform Quality of Service (AMD QoS).
>> +
>> + If your system has the necessary support and you want to be able to
>> + assign tasks to groups and manipulate the associated resource
>> + monitors and controls from userspace, say Y here to get a mountable
>> + 'resctrl' filesystem that lets you do just that.
>> +
>> + If nothing mounts or prods the 'resctrl' filesystem, resource
>> + controls and monitors are left in a quiescent, permissive state.
>> +
>> + On architectures where this can be disabled independently, it is
>> + safe to say N.
>> +
>> + See <file:Documentation/arch/x86/resctrl.rst> for more information.
>> +
>> +config RESCTRL_FS_PSEUDO_LOCK
>> + bool
>> + help
>> + Software mechanism to pin data in a cache portion using
>> + micro-architecture specific knowledge.
>> +
>> +config RESCTRL_RMID_DEPENDS_ON_CLOSID
>> + bool
>> + help
>> + Enabled by the architecture when the RMID values depend on the CLOSID.
>> + This causes the CLOSID allocator to search for CLOSID with clean
>> + RMID.
>
> With RESCTRL_FS_PSEUDO_LOCK and RESCTRL_RMID_DEPENDS_ON_CLOSID appearing at
> same level as RESCTRL_FS all three configs "depends on MISC_FILESYSTEMS".
> Should RESCTRL_FS_PSEUDO_LOCK and RESCTRL_RMID_DEPENDS_ON_CLOSID
> "depends on RESCTRL_FS" instead?
Sure, it can't hurt.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (39 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 40/42] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 6:00 ` Reinette Chatre
2025-02-07 18:18 ` [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
` (2 subsequent siblings)
43 siblings, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Dave Martin, Shaopeng Tan, Tony Luck
Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely
on definitions in x86's internal.h.
Move definitions in internal.h that need to be shared between the
filesystem and architecture code to header files that fs/resctrl can
include.
Doing this separately means the filesystem code only moves between files
of the same name, instead of having these changes mixed in too.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v3:
* Changed the number of hyphens at the end of the commit message.
Changes since v2:
* Dropped the rfflags and some other defines from being moved.
Changes since v1:
* Revert apparently unintentional duplication of a couple of variable
declarations in <linux/resctrl.h>.
No functional change.
---
arch/x86/include/asm/resctrl.h | 3 +++
arch/x86/kernel/cpu/resctrl/core.c | 5 +++++
arch/x86/kernel/cpu/resctrl/internal.h | 9 ---------
include/linux/resctrl_types.h | 3 +++
4 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 7a39728b0743..6eb7d5c94c7a 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -210,6 +210,9 @@ int resctrl_arch_measure_l2_residency(void *_plr);
int resctrl_arch_measure_l3_residency(void *_plr);
void resctrl_cpu_detect(struct cpuinfo_x86 *c);
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
#else
static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6303c0ee0ae2..f2cd7ba39fcc 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -288,6 +288,11 @@ static void rdt_get_cdp_l2_config(void)
rdt_get_cdp_config(RDT_RESOURCE_L2);
}
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+ return rdt_resources_all[l].cdp_enabled;
+}
+
static void mba_wrmsr_amd(struct msr_param *m)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(m->dom);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ee50b7375717..a569e31d75f6 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -16,8 +16,6 @@
#define CQM_LIMBOCHECK_INTERVAL 1000
#define MBM_CNTR_WIDTH_BASE 24
-#define MBM_OVERFLOW_INTERVAL 1000
-#define MAX_MBA_BW 100u
#define MBA_IS_LINEAR 0x4
#define MBM_CNTR_WIDTH_OFFSET_AMD 20
@@ -403,13 +401,6 @@ extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
extern enum resctrl_event_id mba_mbps_default_event;
-static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
-{
- return rdt_resources_all[l].cdp_enabled;
-}
-
-int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
-
void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 70226f5ab3e3..b84a6e0834a7 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -7,6 +7,9 @@
#ifndef __LINUX_RESCTRL_TYPES_H
#define __LINUX_RESCTRL_TYPES_H
+#define MAX_MBA_BW 100u
+#define MBM_OVERFLOW_INTERVAL 1000
+
/* Reads to Local DRAM Memory */
#define READS_TO_LOCAL_MEM BIT(0)
--
2.39.2
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
2025-02-07 18:18 ` [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
@ 2025-02-20 6:00 ` Reinette Chatre
2025-02-28 19:57 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 6:00 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely
> on definitions in x86's internal.h.
>
> Move definitions in internal.h that need to be shared between the
> filesystem and architecture code to header files that fs/resctrl can
> include.
>
> Doing this separately means the filesystem code only moves between files
> of the same name, instead of having these changes mixed in too.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
..
> ---
> arch/x86/include/asm/resctrl.h | 3 +++
> arch/x86/kernel/cpu/resctrl/core.c | 5 +++++
> arch/x86/kernel/cpu/resctrl/internal.h | 9 ---------
> include/linux/resctrl_types.h | 3 +++
> 4 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
> index 7a39728b0743..6eb7d5c94c7a 100644
> --- a/arch/x86/include/asm/resctrl.h
> +++ b/arch/x86/include/asm/resctrl.h
> @@ -210,6 +210,9 @@ int resctrl_arch_measure_l2_residency(void *_plr);
> int resctrl_arch_measure_l3_residency(void *_plr);
> void resctrl_cpu_detect(struct cpuinfo_x86 *c);
>
> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
> +int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
> +
> #else
>
> static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 6303c0ee0ae2..f2cd7ba39fcc 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -288,6 +288,11 @@ static void rdt_get_cdp_l2_config(void)
> rdt_get_cdp_config(RDT_RESOURCE_L2);
> }
>
> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
> +{
> + return rdt_resources_all[l].cdp_enabled;
> +}
> +
This moves resctrl_arch_get_cdp_enabled() to arch/x86/kernel/cpu/resctrl/core.c
while resctrl_arch_set_cdp_enabled() is already in arch/x86/kernel/cpu/resctrl/rdtgroup.c.
Most of resctrl_arch_get_cdp_enabled()'s callers are
in arch/x86/kernel/cpu/resctrl/rdtgroup.c so it seems appropriate to keep it with
its partner resctrl_arch_set_cdp_enabled()?
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
2025-02-20 6:00 ` Reinette Chatre
@ 2025-02-28 19:57 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:57 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 20/02/2025 06:00, Reinette Chatre wrote:
> On 2/7/25 10:18 AM, James Morse wrote:
>> Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely
>> on definitions in x86's internal.h.
>>
>> Move definitions in internal.h that need to be shared between the
>> filesystem and architecture code to header files that fs/resctrl can
>> include.
>>
>> Doing this separately means the filesystem code only moves between files
>> of the same name, instead of having these changes mixed in too.
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 6303c0ee0ae2..f2cd7ba39fcc 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -288,6 +288,11 @@ static void rdt_get_cdp_l2_config(void)
>> rdt_get_cdp_config(RDT_RESOURCE_L2);
>> }
>>
>> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
>> +{
>> + return rdt_resources_all[l].cdp_enabled;
>> +}
>> +
>
> This moves resctrl_arch_get_cdp_enabled() to arch/x86/kernel/cpu/resctrl/core.c
> while resctrl_arch_set_cdp_enabled() is already in arch/x86/kernel/cpu/resctrl/rdtgroup.c.
> Most of resctrl_arch_get_cdp_enabled()'s callers are
> in arch/x86/kernel/cpu/resctrl/rdtgroup.c so it seems appropriate to keep it with
> its partner resctrl_arch_set_cdp_enabled()?
Yup - that would make more sense.
(This will date back to when I was moving the code around by hand every release!)
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (40 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 41/42] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
@ 2025-02-07 18:18 ` James Morse
2025-02-20 6:10 ` Reinette Chatre
2025-02-25 5:02 ` Fenghua Yu
2025-02-10 17:24 ` [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
2025-02-28 1:15 ` Shaopeng Tan (Fujitsu)
43 siblings, 2 replies; 135+ messages in thread
From: James Morse @ 2025-02-07 18:18 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
To support more than one architecture, resctrl needs to move from arch/x86
to live under fs. Moving all the code breaks any series on the mailing
list, so it needs scheduling carefully.
Maintaining the patch that moves all this code has proved labour intensive.
It's also near-impossible to review that no inadvertent changes have
crept in.
To solve these problems, temporarily add a hacky python program that
lists all the functions that should move, and those that should stay.
No attempt is made to parse C code; the script tries to name 'blocks'
based on heuristics about the kernel coding style. It's fragile, but
good enough for its single use here.
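As a rough illustration of that approach (a simplified sketch, not the
code in this patch), the idea can be shown in a few lines. The KEEP and
MOVE sets, the regular expressions, the helper names and the SAMPLE text
are all hypothetical stand-ins: blocks are split at blank lines that
fall outside any braces, named from their first line, and sorted by
looking the name up in the lists.

```python
import re

# Hypothetical, heavily shortened stand-ins for the real keep/move lists.
KEEP = {"resctrl_arch_get_cdp_enabled", "MBA_IS_LINEAR"}
MOVE = {"closid_alloc", "CQM_LIMBOCHECK_INTERVAL"}

# Heuristics that lean on kernel coding style (assumed for this sketch):
# a #define names itself, and a function's name is the identifier just
# before the first '(' on the line that starts the block.
DEFINE_RE = re.compile(r"#\s*define\s+(\w+)")
FUNC_RE = re.compile(r"[\w\s\*]*?(\w+)\s*\(")


def split_blocks(source):
    """Split text into blocks at blank lines that sit outside any braces."""
    blocks, current, depth = [], [], 0
    for line in source.splitlines():
        depth += line.count("{") - line.count("}")
        if line.strip() == "" and depth == 0:
            if current:
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def name_block(block):
    """Guess a block's name from its first line, or return None."""
    first = block.splitlines()[0]
    m = DEFINE_RE.match(first) or FUNC_RE.match(first)
    return m.group(1) if m else None


def sort_blocks(source):
    """Return (keep, move, unknown) lists of blocks."""
    keep, move, unknown = [], [], []
    for block in split_blocks(source):
        name = name_block(block)
        if name in MOVE:
            move.append(block)
        elif name in KEEP:
            keep.append(block)
        else:
            unknown.append(block)  # needs a human to decide
    return keep, move, unknown


SAMPLE = """#define MBA_IS_LINEAR 0x4

bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
{
    return rdt_resources_all[l].cdp_enabled;
}

static int closid_alloc(void)
{
    int closid;

    return closid;
}

struct mystery;
"""

keep, move, unknown = sort_blocks(SAMPLE)
```

Anything the heuristics cannot name lands in the 'unknown' list rather
than being silently kept or moved, which is the property that makes the
generated patch reviewable.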
This only exists to show I have nothing up my sleeve.
I don't suggest this gets merged.
The patch this script generates has the following corner cases:
* The original files are regenerated, which will add newlines that are
not present in the original file.
* A trace-point header file that only contains boiler-plate is created
in the arch and filesystem code. The parser doesn't know how to remove
the includes for these - but it's easy to 'keep' the file contents on
the correct side. A follow-up patch will remove these files and their
includes.
* asm/cpu_device_id.h and a relative path for 'X86_CONFIG()' are kept
in the filesystem code to ensure x86 builds. A follow-up patch will
remove these.
* This script doesn't know how to move the documentation, and update the
links in comments. A follow-up patch does this.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
Changes since v5:
* Regex twiddling to match surprise tabs.
* Added else case to avoid double output of #defines.
---
resctrl_copy_pasta.py | 812 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 812 insertions(+)
create mode 100644 resctrl_copy_pasta.py
diff --git a/resctrl_copy_pasta.py b/resctrl_copy_pasta.py
new file mode 100644
index 000000000000..589eeeb7a9fd
--- /dev/null
+++ b/resctrl_copy_pasta.py
@@ -0,0 +1,812 @@
+#!/usr/bin/python
+import sys;
+import os;
+import re;
+
+############
+
+SRC_DIR = "arch/x86/kernel/cpu/resctrl";
+DST_DIR = "fs/resctrl";
+
+resctrl_files = [
+ "ctrlmondata.c",
+ "internal.h",
+ "monitor.c",
+ "pseudo_lock.c",
+ "rdtgroup.c",
+ "pseudo_lock_trace.h",
+ "monitor_trace.h",
+];
+
+functions_to_keep = [
+ # common
+ "pr_fmt",
+
+ # core.c
+ "domain_list_lock",
+ "resctrl_arch_late_init",
+ "resctrl_arch_exit",
+ "resctrl_cpu_detect",
+ "rdt_cpu_has",
+ "resctrl_arch_is_evt_configurable",
+ "get_mem_config",
+ "get_slow_mem_config",
+ "get_rdt_alloc_resources",
+ "get_rdt_mon_resources",
+ "__check_quirks_intel",
+ "check_quirks",
+ "get_rdt_resources",
+ "rdt_init_res_defs_intel",
+ "rdt_init_res_defs_amd",
+ "rdt_init_res_defs",
+ "resctrl_cpu_detect",
+ "resctrl_arch_late_init",
+ "resctrl_arch_exit",
+ "setup_default_ctrlval",
+ "domain_free",
+ "domain_setup_ctrlval",
+ "arch_domain_mbm_alloc",
+ "domain_add_cpu",
+ "domain_remove_cpu",
+ "clear_closid_rmid",
+ "resctrl_arch_online_cpu",
+ "resctrl_arch_offline_cpu",
+ "resctrl_arch_get_num_closid",
+ "rdt_ctrl_update",
+ "domain_init",
+ "resctrl_arch_get_resource",
+ "cache_alloc_hsw_probe",
+ "rdt_get_mb_table",
+ "__get_mem_config_intel",
+ "__rdt_get_mem_config_amd",
+ "rdt_get_cache_alloc_cfg",
+ "rdt_get_cdp_config",
+ "rdt_get_cdp_l3_config",
+ "rdt_get_cdp_l2_config",
+ "resctrl_arch_get_cdp_enabled",
+ "set_rdt_options",
+ "pqr_state",
+ "rdt_resources_all",
+ "delay_bw_map",
+ "rdt_options",
+ "cat_wrmsr",
+ "mba_wrmsr_amd",
+ "mba_wrmsr_intel",
+ "anonymous-enum",
+ "rdt_find_domain",
+ "rdt_alloc_capable",
+ "rdt_online",
+ "RDT_OPT",
+
+ # ctrlmon.c
+ "apply_config",
+ "resctrl_arch_update_one",
+ "resctrl_arch_update_domains",
+ "resctrl_arch_get_config",
+
+ # internal.h
+ "L3_QOS_CDP_ENABLE",
+ "L2_QOS_CDP_ENABLE",
+ "MBM_CNTR_WIDTH_BASE",
+ "MBA_IS_LINEAR",
+ "MBM_CNTR_WIDTH_OFFSET_AMD",
+ "RMID_VAL_ERROR",
+ "RMID_VAL_UNAVAIL",
+ "MBM_CNTR_WIDTH_OFFSET_MAX",
+ "arch_mbm_state",
+ "rdt_hw_ctrl_domain",
+ "rdt_hw_mon_domain",
+ "resctrl_to_arch_ctrl_dom",
+ "resctrl_to_arch_mon_dom",
+ "msr_param",
+ "rdt_hw_resource",
+ "resctrl_to_arch_res",
+ "rdt_resources_all",
+ "resctrl_inc",
+ "for_each_rdt_resource",
+ "for_each_capable_rdt_resource",
+ "for_each_alloc_capable_rdt_resource",
+ "for_each_mon_capable_rdt_resource",
+ "arch_mon_domain_online",
+ "cpuid_0x10_1_eax",
+ "cpuid_0x10_3_eax",
+ "cpuid_0x10_x_ecx",
+ "cpuid_0x10_x_edx",
+ "rdt_ctrl_update",
+ "rdt_get_mon_l3_config",
+ "rdt_cpu_has",
+ "intel_rdt_mbm_apply_quirk",
+ "rdt_domain_reconfigure_cdp",
+
+ # monitor.c
+ "rdt_mon_capable",
+ "rdt_mon_features",
+ "CF",
+ "snc_nodes_per_l3_cache",
+ "mbm_cf_table",
+ "mbm_cf_rmidthreshold",
+ "mbm_cf",
+ "logical_rmid_to_physical_rmid",
+ "__rmid_read_phys",
+ "get_corrected_mbm_count",
+ "__rmid_read",
+ "get_arch_mbm_state",
+ "resctrl_arch_reset_rmid",
+ "resctrl_arch_reset_rmid_all",
+ "mbm_overflow_count",
+ "resctrl_arch_rmid_read",
+ "snc_cpu_ids",
+ "snc_get_config",
+ "rdt_get_mon_l3_config",
+ "intel_rdt_mbm_apply_quirk",
+
+ # pseudo_lock.c
+ "prefetch_disable_bits",
+ "resctrl_arch_get_prefetch_disable_bits",
+ "resctrl_arch_pseudo_lock_fn",
+ "resctrl_arch_measure_cycles_lat_fn",
+ "perf_miss_attr",
+ "perf_hit_attr",
+ "residency_counts",
+ "measure_residency_fn",
+ "resctrl_arch_measure_l2_residency",
+ "resctrl_arch_measure_l3_residency",
+
+ # rdtgroup.c
+ "rdt_enable_key",
+ "rdt_mon_enable_key",
+ "rdt_alloc_enable_key",
+ "resctrl_arch_sync_cpu_closid_rmid",
+ "INVALID_CONFIG_INDEX",
+ "mon_event_config_index_get",
+ "resctrl_arch_mon_event_config_read",
+ "resctrl_arch_mon_event_config_write",
+ "l3_qos_cfg_update",
+ "l2_qos_cfg_update",
+ "set_cache_qos_cfg",
+ "rdt_domain_reconfigure_cdp",
+ "cdp_enable",
+ "cdp_disable",
+ "resctrl_arch_set_cdp_enabled",
+ "resctrl_arch_reset_all_ctrls",
+
+ # pseudo_lock_trace.h
+ "TRACE_SYSTEM",
+ "pseudo_lock_mem_latency",
+ "pseudo_lock_l2",
+ "pseudo_lock_l3",
+ "TRACE_INCLUDE_PATH",
+ "TRACE_INCLUDE_FILE",
+];
+
+functions_to_move = [
+ # common
+ "pr_fmt",
+
+ # ctrlmon.c
+ "rdt_parse_data",
+ "(ctrlval_parser_t)",
+ "bw_validate",
+ "parse_bw",
+ "cbm_validate",
+ "parse_cbm",
+ "get_parser",
+ "parse_line",
+ "rdtgroup_parse_resource",
+ "rdtgroup_schemata_write",
+ "show_doms",
+ "rdtgroup_schemata_show",
+ "smp_mon_event_count",
+ "rdtgroup_mba_mbps_event_write",
+ "rdtgroup_mba_mbps_event_show",
+ "mon_event_read",
+ "rdtgroup_mondata_show",
+
+ # internal.h
+ "MBM_OVERFLOW_INTERVAL",
+ "CQM_LIMBOCHECK_INTERVAL",
+ "cpumask_any_housekeeping",
+ "rdt_fs_context",
+ "rdt_fc2context",
+ "mon_evt",
+ "mon_data",
+ "rmid_read",
+ "resctrl_schema_all",
+ "resctrl_mounted",
+ "rdt_group_type",
+ "rdtgrp_mode",
+ "mongroup",
+ "rdtgroup",
+ "RDT_DELETED",
+ "RFTYPE_FLAGS_CPUS_LIST",
+ "RFTYPE_INFO",
+ "RFTYPE_BASE",
+ "RFTYPE_CTRL",
+ "RFTYPE_MON",
+ "RFTYPE_TOP",
+ "RFTYPE_RES_CACHE",
+ "RFTYPE_RES_MB",
+ "RFTYPE_DEBUG",
+ "RFTYPE_CTRL_INFO",
+ "RFTYPE_MON_INFO",
+ "RFTYPE_TOP_INFO",
+ "RFTYPE_CTRL_BASE",
+ "RFTYPE_MON_BASE",
+ "rdt_all_groups",
+ "rftype",
+ "mbm_state",
+ "is_mba_sc",
+
+ # monitor.c
+ "rmid_entry",
+ "rmid_free_lru",
+ "closid_num_dirty_rmid",
+ "rmid_limbo_count",
+ "rmid_ptrs",
+ "resctrl_rmid_realloc_threshold",
+ "resctrl_rmid_realloc_limit",
+ "__rmid_entry",
+ "limbo_release_entry",
+ "__check_limbo",
+ "has_busy_rmid",
+ "resctrl_find_free_rmid",
+ "resctrl_find_cleanest_closid",
+ "alloc_rmid",
+ "add_rmid_to_limbo",
+ "free_rmid",
+ "get_mbm_state",
+ "__mon_event_count",
+ "mbm_bw_count",
+ "mon_event_count",
+ "update_mba_bw",
+ "mbm_update_one_event",
+ "mbm_update",
+ "cqm_handle_limbo",
+ "cqm_setup_limbo_handler",
+ "mbm_handle_overflow",
+ "mbm_setup_overflow_handler",
+ "dom_data_init",
+ "dom_data_exit",
+ "llc_occupancy_event",
+ "mbm_total_event",
+ "mbm_local_event",
+ "l3_mon_evt_init",
+ "resctrl_mon_resource_init",
+ "resctrl_mon_resource_exit",
+
+ # pseudo_lock.c
+ "pseudo_lock_major",
+ "pseudo_lock_minor_avail",
+ "pseudo_lock_devnode",
+ "pseudo_lock_class",
+ "pseudo_lock_minor_get",
+ "pseudo_lock_minor_release",
+ "region_find_by_minor",
+ "pseudo_lock_pm_req",
+ "pseudo_lock_cstates_relax",
+ "pseudo_lock_cstates_constrain",
+ "pseudo_lock_region_clear",
+ "pseudo_lock_region_init",
+ "pseudo_lock_init",
+ "pseudo_lock_region_alloc",
+ "pseudo_lock_free",
+ "rdtgroup_monitor_in_progress",
+ "rdtgroup_locksetup_user_restrict",
+ "rdtgroup_locksetup_user_restore",
+ "rdtgroup_locksetup_enter",
+ "rdtgroup_locksetup_exit",
+ "rdtgroup_cbm_overlaps_pseudo_locked",
+ "rdtgroup_pseudo_locked_in_hierarchy",
+ "pseudo_lock_measure_cycles",
+ "pseudo_lock_measure_trigger",
+ "pseudo_measure_fops",
+ "rdtgroup_pseudo_lock_create",
+ "rdtgroup_pseudo_lock_remove",
+ "pseudo_lock_dev_open",
+ "pseudo_lock_dev_release",
+ "pseudo_lock_dev_mremap",
+ "pseudo_mmap_ops",
+ "pseudo_lock_dev_mmap",
+ "pseudo_lock_dev_fops",
+ "rdt_pseudo_lock_init",
+ "rdt_pseudo_lock_release",
+
+ # rdtgroup.c
+ "rdtgroup_mutex",
+ "rdt_root",
+ "rdtgroup_default",
+ "rdt_all_groups",
+ "resctrl_schema_all",
+ "resctrl_mounted",
+ "kn_info",
+ "kn_mongrp",
+ "kn_mondata",
+ "max_name_width",
+ "last_cmd_status",
+ "last_cmd_status_buf",
+ "rdtgroup_setup_root",
+ "rdtgroup_destroy_root",
+ "debugfs_resctrl",
+ "mba_mbps_default_event",
+ "resctrl_debug",
+ "rdt_last_cmd_clear",
+ "rdt_last_cmd_puts",
+ "rdt_last_cmd_printf",
+ "rdt_staged_configs_clear",
+ "resctrl_is_mbm_enabled",
+ "resctrl_is_mbm_event",
+ "closid_free_map",
+ "closid_free_map_len",
+ "closids_supported",
+ "closid_init",
+ "closid_exit",
+ "closid_alloc",
+ "closid_free",
+ "closid_allocated",
+ "rdtgroup_mode_by_closid",
+ "rdt_mode_str",
+ "rdtgroup_mode_str",
+ "rdtgroup_kn_set_ugid",
+ "rdtgroup_add_file",
+ "rdtgroup_seqfile_show",
+ "rdtgroup_file_write",
+ "rdtgroup_kf_single_ops",
+ "kf_mondata_ops",
+ "is_cpu_list",
+ "rdtgroup_cpus_show",
+ "update_closid_rmid",
+ "cpus_mon_write",
+ "cpumask_rdtgrp_clear",
+ "cpus_ctrl_write",
+ "rdtgroup_cpus_write",
+ "rdtgroup_remove",
+ "_update_task_closid_rmid",
+ "update_task_closid_rmid",
+ "task_in_rdtgroup",
+ "__rdtgroup_move_task",
+ "is_closid_match",
+ "is_rmid_match",
+ "rdtgroup_tasks_assigned",
+ "rdtgroup_task_write_permission",
+ "rdtgroup_move_task",
+ "rdtgroup_tasks_write",
+ "show_rdt_tasks",
+ "rdtgroup_tasks_show",
+ "rdtgroup_closid_show",
+ "rdtgroup_rmid_show",
+ "proc_resctrl_show",
+ "rdt_last_cmd_status_show",
+ "rdt_num_closids_show",
+ "rdt_default_ctrl_show",
+ "rdt_min_cbm_bits_show",
+ "rdt_shareable_bits_show",
+ "rdt_bit_usage_show",
+ "rdt_min_bw_show",
+ "rdt_num_rmids_show",
+ "rdt_mon_features_show",
+ "rdt_bw_gran_show",
+ "rdt_delay_linear_show",
+ "max_threshold_occ_show",
+ "rdt_thread_throttle_mode_show",
+ "max_threshold_occ_write",
+ "rdtgroup_mode_show",
+ "resctrl_peer_type",
+ "rdt_has_sparse_bitmasks_show",
+ "__rdtgroup_cbm_overlaps",
+ "rdtgroup_cbm_overlaps",
+ "rdtgroup_mode_test_exclusive",
+ "rdtgroup_mode_write",
+ "rdtgroup_cbm_to_size",
+ "rdtgroup_size_show",
+ "mondata_config_read",
+ "mbm_config_show",
+ "mbm_total_bytes_config_show",
+ "mbm_local_bytes_config_show",
+ "mbm_config_write_domain",
+ "mon_config_write",
+ "mbm_total_bytes_config_write",
+ "mbm_local_bytes_config_write",
+ "res_common_files",
+ "rdtgroup_add_files",
+ "rdtgroup_get_rftype_by_name",
+ "thread_throttle_mode_init",
+ "resctrl_file_fflags_init",
+ "rdtgroup_kn_mode_restrict",
+ "rdtgroup_kn_mode_restore",
+ "rdtgroup_mkdir_info_resdir",
+ "fflags_from_resource",
+ "rdtgroup_create_info_dir",
+ "mongroup_create_dir",
+ "is_mba_linear",
+ "mba_sc_domain_allocate",
+ "mba_sc_domain_destroy",
+ "supports_mba_mbps",
+ "set_mba_sc",
+ "kernfs_to_rdtgroup",
+ "rdtgroup_kn_get",
+ "rdtgroup_kn_put",
+ "rdtgroup_kn_lock_live",
+ "rdtgroup_kn_unlock",
+ "rdt_disable_ctx",
+ "rdt_enable_ctx",
+ "schemata_list_add",
+ "schemata_list_create",
+ "schemata_list_destroy",
+ "rdt_get_tree",
+ "rdt_param",
+ "rdt_fs_parameters",
+ "rdt_parse_param",
+ "rdt_fs_context_free",
+ "rdt_fs_context_ops",
+ "rdt_init_fs_context",
+ "rdt_move_group_tasks",
+ "free_all_child_rdtgrp",
+ "rmdir_all_sub",
+ "rdt_kill_sb",
+ "rdt_fs_type",
+ "mon_addfile",
+ "mon_rmdir_one_subdir",
+ "rmdir_mondata_subdir_allrdtgrp",
+ "mon_add_all_files",
+ "mkdir_mondata_subdir",
+ "mkdir_mondata_subdir_allrdtgrp",
+ "mkdir_mondata_subdir_alldom",
+ "mkdir_mondata_all",
+ "cbm_ensure_valid",
+ "__init_one_rdt_domain",
+ "rdtgroup_init_cat",
+ "rdtgroup_init_mba",
+ "rdtgroup_init_alloc",
+ "mkdir_rdt_prepare_rmid_alloc",
+ "mkdir_rdt_prepare_rmid_free",
+ "mkdir_rdt_prepare",
+ "mkdir_rdt_prepare_clean",
+ "rdtgroup_mkdir_mon",
+ "rdtgroup_mkdir_ctrl_mon",
+ "is_mon_groups",
+ "rdtgroup_mkdir",
+ "rdtgroup_rmdir_mon",
+ "rdtgroup_ctrl_remove",
+ "rdtgroup_rmdir_ctrl",
+ "rdtgroup_rmdir",
+ "mongrp_reparent",
+ "rdtgroup_rename",
+ "rdtgroup_show_options",
+ "rdtgroup_kf_syscall_ops",
+ "rdtgroup_setup_root",
+ "rdtgroup_destroy_root",
+ "rdtgroup_setup_default",
+ "domain_destroy_mon_state",
+ "resctrl_offline_ctrl_domain",
+ "resctrl_offline_mon_domain",
+ "domain_setup_mon_state",
+ "resctrl_online_ctrl_domain",
+ "resctrl_online_mon_domain",
+ "resctrl_online_cpu",
+ "clear_childcpus",
+ "resctrl_offline_cpu",
+ "resctrl_init",
+ "resctrl_exit",
+
+ # monitor_trace.h
+ "TRACE_SYSTEM",
+ "mon_llc_occupancy_limbo",
+ "TRACE_INCLUDE_PATH",
+ "TRACE_INCLUDE_FILE",
+];
+
+############
+
+builtin_non_functions = ["__setup", "__exitcall", "__printf"];
+builtin_one_arg_macros = ["LIST_HEAD", "DEFINE_MUTEX", "DEFINE_STATIC_KEY_FALSE"];
+types = ["bool", "char", "int", "u32", "long", "u64"];
+
+def get_array_name(line):
+    tok = re.search(r'([^\s]+?)\[\]', line)
+    if (tok is None):
+        return None;
+    return tok.group(1);
+
+
+def get_struct_name(line):
+    tok = re.search(r'struct ([^\s]+?) {', line)
+    if (tok is None):
+        return None;
+    return tok.group(1);
+
+def get_enum_name(line):
+    tok = re.search(r'enum ([^\s]+?) {', line)
+    if (tok is None):
+        return None;
+    return tok.group(1);
+
+def get_union_name(line):
+    tok = re.search(r'union ([^\s]+?) {', line)
+    if (tok is None):
+        return None;
+    return tok.group(1);
+
+
+def get_macro_name(line):
+    # #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+    tok = re.search(r'#define[\s]+([^\s]+?)\(', line)
+    if (tok):
+        return tok.group(1);
+
+    # #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
+    tok = re.search(r'#define[\s]+([^\s]+?)[\s]+.+?\n', line)
+    if (tok):
+        return tok.group(1);
+
+    return None;
+
+
+def get_macro_target(line):
+    tok = re.search(r'[^\s]+?\(([^\s]+?)\);\n', line)
+    if (tok):
+        return tok.group(1);
+
+    return None;
+
+
+# Things like 'bool my_bool;'
+def get_object_name(line):
+    # remove things that don't change the meaning of the name
+    if line.startswith("static "):
+        line = line[len("static "):];
+    if line.startswith("extern "):
+        line = line[len("extern "):];
+    if line.startswith("unsigned "):
+        line = line[len("unsigned "):];
+
+    # Note the trailing semicolon..
+    tok = re.search(r'([^\s]+)\s[\*]*([^\s\[\],;]+)', line)
+    if tok:
+        if tok.group(1) in types:
+            return tok.group(2);
+
+    tok = re.search(r'struct\s[^\s]+\s[\*]*([^\s;]+)', line)
+    if tok:
+        return tok.group(1);
+
+    tok = re.search(r'enum\s[^\s]+\s([^\s;]+)', line)
+    if tok:
+        return tok.group(1);
+
+    return None;
+
+
+# Is there a name for this block of code?
+#
+# Function names are the token before '(' ... assuming there is only one '('.
+# This also handles structs and arrays,
+def get_block_name(line):
+    # remove things that don't change the meaning of the name
+    if (" __read_mostly" in line):
+        line = line.replace(" __read_mostly", "");
+    if (" __initconst" in line):
+        line = line.replace(" __initconst", "");
+
+    if line == "enum {\n":
+        return "anonymous-enum";
+    if (line.startswith("#define")):
+        return get_macro_name(line);
+
+    if ("=" in line):
+        tok = re.search(r'[\*]*([^\s\[\]]+?)[\s\[\]]*=', line)
+    else:
+        tok = re.search(r'[\*]*([^\s]+?)\(.+?', line)
+
+    if (tok is None):
+        if ("[]" in line):
+            return get_array_name(line);
+        if (line.startswith("struct") and line.endswith("{\n")):
+            return get_struct_name(line);
+        if (line.startswith("enum") and line.endswith("{\n")):
+            return get_enum_name(line);
+        if (line.startswith("union") and line.endswith("{\n")):
+            return get_union_name(line);
+        if (line.endswith(";\n") and '(' not in line):
+            return get_object_name(line);
+        if (line.endswith("= {\n") and '(' not in line):
+            return get_object_name(line);
+        return None;
+
+    func_name = tok.group(1);
+    if (func_name in builtin_one_arg_macros):
+        tok = re.search(r'[^\(]+\(([^\s]+?)\)', line)
+        if (tok is None):
+            return None;
+        return tok.group(1);
+    elif (func_name == "DEFINE_PER_CPU"):
+        tok = re.search(r'DEFINE_PER_CPU\(.+?, ([^\s]+?)\)', line)
+        if (tok is None):
+            return None;
+        return tok.group(1);
+    elif (func_name == "TRACE_EVENT"):
+        tok = re.search(r'TRACE_EVENT\((.+?),', line)
+        if (tok is None):
+            return None;
+        return tok.group(1);
+    elif (func_name == "late_initcall"):
+        return get_macro_target(line);
+    else:
+        return func_name;
+
+def output_function_body(body, file):
+    # Mandatory whitespace between blocks
+    if os.lseek(file.fileno(), 0, os.SEEK_CUR) > 0:
+        file.write("\n".encode());
+
+    for out_line in body:
+        file.write(out_line.encode());
+
+# Where should we put this block of code?
+def output_function(name, body, files):
+    output = False;
+    (new_src, new_dst) = files;
+
+    if (len(body)) == 0:
+        return;
+
+    # Output to both files
+    if (name is None):
+        output_function_body(body, new_src);
+        output_function_body(body, new_dst);
+        output = True;
+    if (name in functions_to_keep):
+        output_function_body(body, new_src);
+        output = True;
+    if (name in functions_to_move):
+        output_function_body(body, new_dst);
+        output = True;
+
+    if not output:
+        print("Missing function name: "+name);
+        #print(body);
+
+def reset_parser():
+    global function_name;
+    global define_name;
+    global function_body;
+    global in_define;
+
+    function_name = None;
+    define_name = None;
+    function_body = [];
+    in_define = False;
+
+############
+
+for file in resctrl_files:
+    function_name = None;
+    # function_names take priority over defines, this is only used when
+    # no function_name was found
+    define_name = None;
+    function_body = [];
+    # Nothing clever - this is just to detect newlines between functions
+    in_function = False;
+    in_define = False;
+
+    src_path = SRC_DIR + "/" + str(file);
+    if (not os.path.isfile(src_path)):
+        continue;
+    dst_path = DST_DIR + "/" + str(file);
+
+    orig_file = open(src_path, "r");
+    lines = orig_file.readlines();
+
+    # Now unlink the original file, so it can be re-created with new
+    # contents.
+    try:
+        os.unlink(src_path);
+    except Exception as err:
+        print(f"Failed to unlink source file: {err}");
+        sys.exit(1);
+
+    # non-buffering is so we can snoop the fd offset to avoid trailing newlines
+    new_src = open(src_path, "wb", buffering=0);
+    new_dst = open(dst_path, "wb", buffering=0);
+
+    for line in lines:
+        # Empty lines outside a function - reset the function tracking
+        if (line == "\n" and not in_function):
+            if function_name is None and define_name is not None:
+                function_name = define_name;
+            output_function(function_name, function_body, (new_src, new_dst));
+            reset_parser();
+
+        # Function prototypes are a funny C thing - reset the function tracking
+        elif (line[0].isspace() and not in_function and line.endswith(");\n")):
+            function_body += [line];
+            output_function(function_name, function_body, (new_src, new_dst));
+            reset_parser();
+
+        # Lines that begin with whitespace are part of the current function.
+        elif (line[0].isspace()):
+            function_body += [line];
+
+        # Next, try to find the kind of line that contains a function name
+
+        # Ignore lines with comment markers, braces
+        elif (line.startswith("/*")):
+            function_body += [line];
+        elif (line.startswith("*/")):
+            function_body += [line];
+        elif (line.startswith("//")):
+            function_body += [line];
+        elif (line == "{\n"):
+            function_body += [line];
+            in_function = True;
+        elif (line == "}\n"):
+            function_body += [line];
+            in_function = False;
+        elif (line == "};\n"):
+            function_body += [line];
+            in_function = False;
+
+        elif (line.startswith("#include")):
+            function_body += [line];
+        elif (line.startswith("#if ")):
+            function_body += [line];
+        elif (line.startswith("#ifdef ")):
+            function_body += [line];
+        elif (line.startswith("#ifndef ")):
+            function_body += [line];
+        elif (line.startswith("#else")):
+            function_body += [line];
+        elif (line.startswith("#endif")):
+            function_body += [line];
+        elif (line.startswith("#undef ")):
+            function_body += [line];
+        elif (line.startswith("#define")):
+            function_body += [line];
+            define_name = get_block_name(line);
+
+            # Multi-line define?
+            if line.endswith("\\\n"):
+                in_define = True;
+            else:
+                output_function(define_name, function_body, (new_src, new_dst));
+                reset_parser();
+        elif in_define and line.endswith("\\\n"):
+            function_body += [line];
+
+        # goto was always a crime
+        elif (' ' not in line and line.endswith(":\n")):
+            function_body += [line];
+
+        # Try and parse a function/array name
+
+        # Things like late_initcall() aren't function names, but belong to
+        # the previous function.
+        elif (get_block_name(line) in builtin_non_functions):
+            function_body += [line];
+
+        # Start a new block if we can get a block name for this line
+        elif (get_block_name(line) is not None and function_name is None):
+            _name = get_block_name(line);
+
+            if (line.endswith("{\n")):
+                in_function = True;
+
+            # Is this a function prototype? Output it now
+            if (line.endswith(";\n")):
+                function_body += [line];
+                output_function(_name, function_body, (new_src, new_dst));
+                reset_parser();
+            else:
+                function_name = _name;
+                function_body += [line];
+
+        # Failed to parse a function name ... did it get split up?
+        elif (line.startswith("static")):
+            function_body += [line];
+
+        else:
+            print("Unknown: '" + line + "'");
+
+    # Output whatever is left in the buffer
+    output_function(function_name, function_body, (new_src, new_dst));
+
+    orig_file.close();
+    new_src.close();
+    new_dst.close();
--
2.39.2
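The block-naming heuristic used by the script can be tried on a few sample lines. What follows is a minimal sketch and not the script's own code: `block_name` is a hypothetical, simplified stand-in for get_block_name() that only covers #define lines, assignments and function definitions, and the sample lines are illustrative.

```python
import re

def block_name(line):
    """Guess a name for a top-level C line (simplified, hypothetical helper)."""
    if line.startswith("#define"):
        # Macro name: the token after '#define', up to whitespace or '('.
        tok = re.search(r'#define\s+([^\s(]+)', line)
        return tok.group(1) if tok else None
    if "=" in line:
        # Assignments: the token just before '=', ignoring any '[]'.
        tok = re.search(r'\**([^\s\[\]]+?)[\s\[\]]*=', line)
    else:
        # Functions: the token just before the first '('.
        tok = re.search(r'\**([^\s]+?)\(', line)
    return tok.group(1) if tok else None

samples = [
    "static int closid_alloc(void)\n",
    "#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)\n",
    "static struct rftype res_common_files[] = {\n",
    "struct rdtgroup *kernfs_to_rdtgroup(struct kernfs_node *kn)\n",
]
names = [block_name(s) for s in samples]
for sample, name in zip(samples, names):
    print(name, "<-", sample.strip())
```

Each resulting name is what would then be looked up in the 'keep' and 'move' lists to decide which output file receives the block.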
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-07 18:18 ` [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
@ 2025-02-20 6:10 ` Reinette Chatre
2025-02-25 16:16 ` Reinette Chatre
2025-02-25 5:02 ` Fenghua Yu
1 sibling, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-20 6:10 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/7/25 10:18 AM, James Morse wrote:
> To support more than one architecture resctrl needs to move from arch/x86
> to live under fs. Moving all the code breaks any series on the mailing
> list, so needs scheduling carefully.
>
> Maintaining the patch that moves all this code has proved labour intensive.
> It's also near-impossible to review that no inadvertent changes have
> crept in.
>
> To solve these problems, temporarily add a hacky python program that
> lists all the functions that should move, and those that should stay.
>
> No attempt to parse C code is made, this thing tries to name 'blocks'
> based on hueristics about the kernel coding style. It's fragile, but
(heuristics)
> good enough for its single use here.
>
> This only exists to show I have nothing up my sleeve.
> I don't suggested this gets merged.
>
> The patch this script generaets has the following corner cases:
(generates)
> * The original files are regenerated, which will add newlines that are
> not present in the original file.
> * An trace-point header file the only contains boiler-plate is created
> in the arch and filesystem code. The parser doesn't know how to remove
> the includes for these - but its easy to 'keep' the file contents on
> the correct side. A follow-up patch will remove these files and their
> includes.
Related to the tracepoints, I also noticed that there are some leftover
tracing defines in files that no longer make use of tracing.
For example, arch/x86/kernel/cpu/resctrl/monitor.c contains:
#define CREATE_TRACE_POINTS
#include "monitor_trace.h"
and fs/resctrl/pseudo_lock.c contains:
#define CREATE_TRACE_POINTS
#include "pseudo_lock_trace.h"
I assume these will also be removed in this follow-up patch?
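Leftovers of this kind could be flagged mechanically. The following is only a rough sketch under stated assumptions: `stale_trace_includes` is a hypothetical helper, it works on file contents passed in as strings, and it treats a header as empty when it contains no TRACE_EVENT( at all, which ignores other ways of declaring events.

```python
import re

def stale_trace_includes(source, headers):
    """Return trace headers pulled in after CREATE_TRACE_POINTS that define no events.

    source:  text of a .c file
    headers: dict mapping header file name to header text
    """
    stale = []
    # '#define CREATE_TRACE_POINTS', optional blank lines, then a quoted #include.
    pattern = re.compile(
        r'^#define\s+CREATE_TRACE_POINTS\s*\n\s*#include\s+"([^"]+)"',
        re.MULTILINE)
    for match in pattern.finditer(source):
        header = match.group(1)
        if "TRACE_EVENT(" not in headers.get(header, ""):
            stale.append(header)
    return stale

# Illustrative inputs modelled on the snippets quoted in this thread.
monitor_c = ('#include "internal.h"\n'
             '\n'
             '#define CREATE_TRACE_POINTS\n'
             '\n'
             '#include "monitor_trace.h"\n')
boilerplate_only = ('#undef TRACE_SYSTEM\n'
                    '#define TRACE_SYSTEM resctrl\n'
                    '#include <linux/tracepoint.h>\n')
with_event = boilerplate_only + 'TRACE_EVENT(mon_llc_occupancy_limbo,\n'

stale = stale_trace_includes(monitor_c, {"monitor_trace.h": boilerplate_only})
clean = stale_trace_includes(monitor_c, {"monitor_trace.h": with_event})
print(stale, clean)
```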
> * asm/cpu_device_id.h and a relative path for 'X86_CONFIG()' are kept
> in the filesystem code to ensure x86 builds. A follow-up patch will
> remove these.
> * This script doesn't know how to move the documentation, and update the
> links in comments. A follow-up patch does this.
One unexpected thing I noticed is that fs/resctrl/internal.h contains:
#ifndef _ASM_X86_RESCTRL_INTERNAL_H
#define _ASM_X86_RESCTRL_INTERNAL_H
...
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
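A stale guard like this could be caught with a small check after the move. This is a sketch only: `guard_mismatch` is a hypothetical helper, and the rule it applies (a guard containing 'X86' is suspicious outside arch/x86/) is an assumption made for illustration, not a kernel convention taken from this thread.

```python
import re

def guard_mismatch(path, text):
    """Return the include guard if it looks x86-specific but the header
    lives outside arch/x86/, otherwise None."""
    tok = re.search(r'^#ifndef\s+(\w+)\s*\n#define\s+(\w+)', text, re.MULTILINE)
    if tok is None or tok.group(1) != tok.group(2):
        # No recognisable '#ifndef X / #define X' guard pair.
        return None
    guard = tok.group(1)
    if "X86" in guard and not path.startswith("arch/x86/"):
        return guard
    return None

header = ("#ifndef _ASM_X86_RESCTRL_INTERNAL_H\n"
          "#define _ASM_X86_RESCTRL_INTERNAL_H\n"
          "\n"
          "#endif /* _ASM_X86_RESCTRL_INTERNAL_H */\n")

moved = guard_mismatch("fs/resctrl/internal.h", header)
kept = guard_mismatch("arch/x86/kernel/cpu/resctrl/internal.h", header)
print(moved, kept)
```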
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-20 6:10 ` Reinette Chatre
@ 2025-02-25 16:16 ` Reinette Chatre
2025-02-28 19:57 ` James Morse
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-25 16:16 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi James,
On 2/19/25 10:10 PM, Reinette Chatre wrote:
> Hi James,
>
> On 2/7/25 10:18 AM, James Morse wrote:
>> To support more than one architecture resctrl needs to move from arch/x86
>> to live under fs. Moving all the code breaks any series on the mailing
>> list, so needs scheduling carefully.
>>
>> Maintaining the patch that moves all this code has proved labour intensive.
>> It's also near-impossible to review that no inadvertent changes have
>> crept in.
>>
>> To solve these problems, temporarily add a hacky python program that
>> lists all the functions that should move, and those that should stay.
>>
>> No attempt to parse C code is made, this thing tries to name 'blocks'
>> based on hueristics about the kernel coding style. It's fragile, but
>
> (heuristics)
>
>> good enough for its single use here.
>>
>> This only exists to show I have nothing up my sleeve.
>> I don't suggested this gets merged.
>>
>> The patch this script generaets has the following corner cases:
> (generates)
>
>> * The original files are regenerated, which will add newlines that are
>> not present in the original file.
>> * An trace-point header file the only contains boiler-plate is created
>> in the arch and filesystem code. The parser doesn't know how to remove
>> the includes for these - but its easy to 'keep' the file contents on
>> the correct side. A follow-up patch will remove these files and their
>> includes.
>
> Related to the tracepoints I also noticed that there are some leftover
> tracing defines in files that no longer make use of tracing.
> For example, arch/x86/kernel/cpu/resctrl/monitor.c contains:
> #define CREATE_TRACE_POINTS
> #include monitor_trace.h
>
> and fs/resctrl/pseudo_lock.c contains:
> #define CREATE_TRACE_POINTS
> #include "pseudo_lock_trace.h"
>
> I assumed this will also be removed in this follow-up patch?
>
>> * asm/cpu_device_id.h and a relative path for 'X86_CONFIG()' are kept
>> in the filesystem code to ensure x86 builds. A follow-up patch will
>> remove these.
>> * This script doesn't know how to move the documentation, and update the
>> links in comments. A follow-up patch does this.
>
> One unexpected thing I noticed is that fs/resctr/internal.h contains:
> #ifndef _ASM_X86_RESCTRL_INTERNAL_H
> #define _ASM_X86_RESCTRL_INTERNAL_H
> ...
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
It looks like another item for this list of "corner cases" is that the
#include lines of all files need to be reviewed after the code move. Some
functions that depend on a particular #include are moved out of a .c
file, but the (no longer needed) #include remains.
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-25 16:16 ` Reinette Chatre
@ 2025-02-28 19:57 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:57 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni, Shaopeng Tan,
Tony Luck
Hi Reinette,
On 25/02/2025 16:16, Reinette Chatre wrote:
> On 2/19/25 10:10 PM, Reinette Chatre wrote:
>> On 2/7/25 10:18 AM, James Morse wrote:
>>> To support more than one architecture resctrl needs to move from arch/x86
>>> to live under fs. Moving all the code breaks any series on the mailing
>>> list, so needs scheduling carefully.
>>>
>>> Maintaining the patch that moves all this code has proved labour intensive.
>>> It's also near-impossible to review that no inadvertent changes have
>>> crept in.
>>>
>>> To solve these problems, temporarily add a hacky python program that
>>> lists all the functions that should move, and those that should stay.
>>>
>>> No attempt to parse C code is made, this thing tries to name 'blocks'
>>> based on hueristics about the kernel coding style. It's fragile, but
>>
>> (heuristics)
>>
>>> good enough for its single use here.
>>>
>>> This only exists to show I have nothing up my sleeve.
>>> I don't suggested this gets merged.
>>>
>>> The patch this script generaets has the following corner cases:
>> (generates)
>>
>>> * The original files are regenerated, which will add newlines that are
>>> not present in the original file.
>>> * An trace-point header file the only contains boiler-plate is created
>>> in the arch and filesystem code. The parser doesn't know how to remove
>>> the includes for these - but its easy to 'keep' the file contents on
>>> the correct side. A follow-up patch will remove these files and their
>>> includes.
>>
>> Related to the tracepoints I also noticed that there are some leftover
>> tracing defines in files that no longer make use of tracing.
>> For example, arch/x86/kernel/cpu/resctrl/monitor.c contains:
>> #define CREATE_TRACE_POINTS
>> #include monitor_trace.h
>>
>> and fs/resctrl/pseudo_lock.c contains:
>> #define CREATE_TRACE_POINTS
>> #include "pseudo_lock_trace.h"
>>
>> I assumed this will also be removed in this follow-up patch?
Yup:
https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.14-rc1&id=3d0430324a0c7e7ad765140f9e78a9a312a13573
I'll include this patch in v7, you found a case where it's not as harmless as I thought.
>>> * asm/cpu_device_id.h and a relative path for 'X86_CONFIG()' are kept
>>> in the filesystem code to ensure x86 builds. A follow-up patch will
>>> remove these.
>>> * This script doesn't know how to move the documentation, and update the
>>> links in comments. A follow-up patch does this.
>>
>> One unexpected thing I noticed is that fs/resctr/internal.h contains:
>> #ifndef _ASM_X86_RESCTRL_INTERNAL_H
>> #define _ASM_X86_RESCTRL_INTERNAL_H
>> ...
>> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
That's a new one - I'll add a follow-up patch to change those.
> It looks like another item for this list of "corner cases" is that the
> #include of all files need to be reviewed after the code move. There are
> functions depending on a particular #include that are moved out of the .c
> file but the (no longer needed) #include remains.
Indeed, that is one of the followups that I'll include in v7.
https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.14-rc1&id=9e2bd53f5e2b33fef69db1aae2dd7aeeaf1dd24c
I suggest all these get merged into the patch that moves the code - but I'll post them
separately in case anyone is interested in regenerating the patch using this script.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-07 18:18 ` [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
2025-02-20 6:10 ` Reinette Chatre
@ 2025-02-25 5:02 ` Fenghua Yu
2025-02-28 19:57 ` James Morse
2025-02-28 20:06 ` Moger, Babu
1 sibling, 2 replies; 135+ messages in thread
From: Fenghua Yu @ 2025-02-25 5:02 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi, James and Reinette,
On 2/7/25 10:18, James Morse wrote:
> To support more than one architecture resctrl needs to move from arch/x86
> to live under fs. Moving all the code breaks any series on the mailing
> list, so needs scheduling carefully.
>
> Maintaining the patch that moves all this code has proved labour intensive.
> It's also near-impossible to review that no inadvertent changes have
> crept in.
>
> To solve these problems, temporarily add a hacky python program that
> lists all the functions that should move, and those that should stay.
>
> No attempt to parse C code is made, this thing tries to name 'blocks'
> based on hueristics about the kernel coding style. It's fragile, but
> good enough for its single use here.
>
> This only exists to show I have nothing up my sleeve.
> I don't suggested this gets merged.
>
> The patch this script generaets has the following corner cases:
> * The original files are regenerated, which will add newlines that are
> not present in the original file.
> * An trace-point header file the only contains boiler-plate is created
> in the arch and filesystem code. The parser doesn't know how to remove
> the includes for these - but its easy to 'keep' the file contents on
> the correct side. A follow-up patch will remove these files and their
> includes.
Because no trace events are defined in the _trace.h files, compilation
errors are reported when building the kernel with W=1.
Is the patch below the "follow-up" patch mentioned here? With it applied,
no more errors are reported with W=1.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1809e3fe6ef3..800e52845b1d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -27,10 +27,6 @@
#include "internal.h"
-#define CREATE_TRACE_POINTS
-
-#include "monitor_trace.h"
-
/*
* Global boolean for rdt_monitor which is true if any
* resource monitoring is enabled.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor_trace.h b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
deleted file mode 100644
index b5a142dd0f0e..000000000000
--- a/arch/x86/kernel/cpu/resctrl/monitor_trace.h
+++ /dev/null
@@ -1,17 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#undef TRACE_SYSTEM
-#define TRACE_SYSTEM resctrl
-
-#if !defined(_FS_RESCTRL_MONITOR_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _FS_RESCTRL_MONITOR_TRACE_H
-
-#include <linux/tracepoint.h>
-
-#endif /* _FS_RESCTRL_MONITOR_TRACE_H */
-
-#undef TRACE_INCLUDE_PATH
-#define TRACE_INCLUDE_PATH .
-
-#define TRACE_INCLUDE_FILE monitor_trace
-
-#include <trace/define_trace.h>
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
index d8389779835d..6c49dd60174f 100644
--- a/fs/resctrl/pseudo_lock.c
+++ b/fs/resctrl/pseudo_lock.c
@@ -29,10 +29,6 @@
#include "../../events/perf_event.h" /* For X86_CONFIG() */
#include "internal.h"
-#define CREATE_TRACE_POINTS
-
-#include "pseudo_lock_trace.h"
-
/*
* Major number assigned to and shared by all devices exposing
* pseudo-locked regions.
diff --git a/fs/resctrl/pseudo_lock_trace.h b/fs/resctrl/pseudo_lock_trace.h
deleted file mode 100644
index 7a6a1983953a..000000000000
--- a/fs/resctrl/pseudo_lock_trace.h
+++ /dev/null
@@ -1,17 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#undef TRACE_SYSTEM
-#define TRACE_SYSTEM resctrl
-
-#if !defined(_X86_RESCTRL_PSEUDO_LOCK_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _X86_RESCTRL_PSEUDO_LOCK_TRACE_H
-
-#include <linux/tracepoint.h>
-
-#endif /* _X86_RESCTRL_PSEUDO_LOCK_TRACE_H */
-
-#undef TRACE_INCLUDE_PATH
-#define TRACE_INCLUDE_PATH .
-
-#define TRACE_INCLUDE_FILE pseudo_lock_trace
-
-#include <trace/define_trace.h>
^ permalink raw reply related [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-25 5:02 ` Fenghua Yu
@ 2025-02-28 19:57 ` James Morse
2025-02-28 20:06 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:57 UTC (permalink / raw)
To: Fenghua Yu, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi Fenghua,
On 25/02/2025 05:02, Fenghua Yu wrote:
> On 2/7/25 10:18, James Morse wrote:
>> To support more than one architecture resctrl needs to move from arch/x86
>> to live under fs. Moving all the code breaks any series on the mailing
>> list, so needs scheduling carefully.
>>
>> Maintaining the patch that moves all this code has proved labour intensive.
>> It's also near-impossible to review that no inadvertent changes have
>> crept in.
>>
>> To solve these problems, temporarily add a hacky python program that
>> lists all the functions that should move, and those that should stay.
>>
>> No attempt to parse C code is made, this thing tries to name 'blocks'
>> based on hueristics about the kernel coding style. It's fragile, but
>> good enough for its single use here.
>>
>> This only exists to show I have nothing up my sleeve.
>> I don't suggested this gets merged.
>>
>> The patch this script generaets has the following corner cases:
>> * The original files are regenerated, which will add newlines that are
>> not present in the original file.
>> * An trace-point header file the only contains boiler-plate is created
>> in the arch and filesystem code. The parser doesn't know how to remove
>> the includes for these - but its easy to 'keep' the file contents on
>> the correct side. A follow-up patch will remove these files and their
>> includes.
>
> Due to no trace event defined in the _trace.h files, compilation errors are reported when
> building kernel by W=1.
>
> This patch seems the "follow-up" patch mentioned here? After this patch is applied, no
> more errors reported when W=1.
Yup, the follow up is here:
https://web.git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.14-rc1&id=3d0430324a0c7e7ad765140f9e78a9a312a13573
I thought this was harmless, but evidently kbuild doesn't like it.
I'll include all these in v7 - I suggest they get squashed into the generated patch once
they're reviewed.
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2025-02-25 5:02 ` Fenghua Yu
2025-02-28 19:57 ` James Morse
@ 2025-02-28 20:06 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: Moger, Babu @ 2025-02-28 20:06 UTC (permalink / raw)
To: Fenghua Yu, James Morse, x86, linux-kernel
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni, Shaopeng Tan, Tony Luck
Hi All,
On 2/24/2025 11:02 PM, Fenghua Yu wrote:
> Hi, James and Reinette,
>
> On 2/7/25 10:18, James Morse wrote:
>> To support more than one architecture resctrl needs to move from arch/x86
>> to live under fs. Moving all the code breaks any series on the mailing
>> list, so needs scheduling carefully.
>>
>> Maintaining the patch that moves all this code has proved labour
>> intensive.
>> It's also near-impossible to review that no inadvertent changes have
>> crept in.
>>
>> To solve these problems, temporarily add a hacky python program that
>> lists all the functions that should move, and those that should stay.
>>
>> No attempt to parse C code is made, this thing tries to name 'blocks'
>> based on hueristics about the kernel coding style. It's fragile, but
>> good enough for its single use here.
>>
>> This only exists to show I have nothing up my sleeve.
>> I don't suggested this gets merged.
>>
>> The patch this script generaets has the following corner cases:
>> * The original files are regenerated, which will add newlines that are
>> not present in the original file.
>> * An trace-point header file the only contains boiler-plate is created
>> in the arch and filesystem code. The parser doesn't know how to remove
>> the includes for these - but its easy to 'keep' the file contents on
>> the correct side. A follow-up patch will remove these files and their
>> includes.
>
> Because no trace events are defined in the _trace.h files, compilation
> errors are reported when building the kernel with W=1.
>
> This patch seems to be the "follow-up" patch mentioned here? After this
> patch is applied, no more errors are reported with W=1.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 1809e3fe6ef3..800e52845b1d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -27,10 +27,6 @@
>
> #include "internal.h"
>
> -#define CREATE_TRACE_POINTS
> -
> -#include "monitor_trace.h"
> -
> /*
> * Global boolean for rdt_monitor which is true if any
> * resource monitoring is enabled.
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor_trace.h b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
> deleted file mode 100644
> index b5a142dd0f0e..000000000000
> --- a/arch/x86/kernel/cpu/resctrl/monitor_trace.h
> +++ /dev/null
> @@ -1,17 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#undef TRACE_SYSTEM
> -#define TRACE_SYSTEM resctrl
> -
> -#if !defined(_FS_RESCTRL_MONITOR_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
> -#define _FS_RESCTRL_MONITOR_TRACE_H
> -
> -#include <linux/tracepoint.h>
> -
> -#endif /* _FS_RESCTRL_MONITOR_TRACE_H */
> -
> -#undef TRACE_INCLUDE_PATH
> -#define TRACE_INCLUDE_PATH .
> -
> -#define TRACE_INCLUDE_FILE monitor_trace
> -
> -#include <trace/define_trace.h>
> diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
> index d8389779835d..6c49dd60174f 100644
> --- a/fs/resctrl/pseudo_lock.c
> +++ b/fs/resctrl/pseudo_lock.c
> @@ -29,10 +29,6 @@
> #include "../../events/perf_event.h" /* For X86_CONFIG() */
> #include "internal.h"
>
> -#define CREATE_TRACE_POINTS
> -
> -#include "pseudo_lock_trace.h"
> -
> /*
> * Major number assigned to and shared by all devices exposing
> * pseudo-locked regions.
> diff --git a/fs/resctrl/pseudo_lock_trace.h b/fs/resctrl/pseudo_lock_trace.h
> deleted file mode 100644
> index 7a6a1983953a..000000000000
> --- a/fs/resctrl/pseudo_lock_trace.h
> +++ /dev/null
> @@ -1,17 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#undef TRACE_SYSTEM
> -#define TRACE_SYSTEM resctrl
> -
> -#if !defined(_X86_RESCTRL_PSEUDO_LOCK_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
> -#define _X86_RESCTRL_PSEUDO_LOCK_TRACE_H
> -
> -#include <linux/tracepoint.h>
> -
> -#endif /* _X86_RESCTRL_PSEUDO_LOCK_TRACE_H */
> -
> -#undef TRACE_INCLUDE_PATH
> -#define TRACE_INCLUDE_PATH .
> -
> -#define TRACE_INCLUDE_FILE pseudo_lock_trace
> -
> -#include <trace/define_trace.h>
>
>
Just to confirm. I had the same build issues. This patch resolved the
problem. Booted the system and ran basic resctrl tests and everything
worked as expected.
Thanks
Babu
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (41 preceding siblings ...)
2025-02-07 18:18 ` [PATCH v6 42/42] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
@ 2025-02-10 17:24 ` Reinette Chatre
2025-02-11 14:36 ` Peter Newman
2025-02-11 18:37 ` James Morse
2025-02-28 1:15 ` Shaopeng Tan (Fujitsu)
43 siblings, 2 replies; 135+ messages in thread
From: Reinette Chatre @ 2025-02-10 17:24 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
I'd like to check in on what you said in [1]. It sounded as though you were
planning to look at the assignable counter work from an Arm/MPAM
perspective but that work has since progressed (now at V11 [2]) without
input from an Arm/MPAM perspective. As I understand it, assignable counters may benefit
MPAM and are looking close to settled, but it is difficult to gain confidence
in an interface that may (may not?) be used for MPAM without any feedback
from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
this new interface and find it confusing that there does not seem to be
any input from MPAM side. What am I missing?
Reinette
[1] https://lore.kernel.org/lkml/9479c799-86fc-4d9e-addb-66011ecae9c7@arm.com/
[2] https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-10 17:24 ` [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
@ 2025-02-11 14:36 ` Peter Newman
2025-02-11 18:37 ` James Morse
2025-02-12 15:24 ` Moger, Babu
2025-02-11 18:37 ` James Morse
1 sibling, 2 replies; 135+ messages in thread
From: Peter Newman @ 2025-02-11 14:36 UTC (permalink / raw)
To: Reinette Chatre
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
Hi Reinette,
On Mon, Feb 10, 2025 at 6:24 PM Reinette Chatre
<reinette.chatre@intel.com> wrote:
>
> Hi James,
>
> I'd like to check in on what you said in [1]. It sounded as though you were
> planning to look at the assignable counter work from an Arm/MPAM
> perspective but that work has since progressed (now at V11 [2]) without
> input from Arm/MPAM perspective. As I understand assignable counters may benefit
> MPAM and looking close to settled but it is difficult to gain confidence
> in an interface that may (may not?) be used for MPAM without any feedback
> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
> this new interface and find it confusing that there does not seem to be
> any input from MPAM side. What am I missing?
I've looked into monitor assignment on MPAM a little, so I'll share my findings.
Like with ABMC/BMEC, MPAM's counters can be configured to monitor
reads, writes, or both, so there are situations where it would be
useful to be able to assign 2 counters to the same group to be able to
break down the bandwidth between reads and writes. However, a group's
two assignment slots are called "local" and "total", so if MPAM's
resources only support one of the two, then only one counter can be
assigned to a group.
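
A minimal sketch of the constraint described above, as a toy model rather than resctrl code. It assumes a group has one assignment slot per supported MBM event, named "local" and "total" as in the paragraph; the helper names `assignable_slots` and `can_split_reads_and_writes` are hypothetical and exist only for illustration.

```python
# Hypothetical model, not resctrl code: a monitoring group has one
# assignment slot per MBM event that the resource supports.
SLOT_NAMES = ("local", "total")


def assignable_slots(supported_events):
    """Return the usable slots, in the fixed "local", "total" order."""
    return [name for name in SLOT_NAMES if name in supported_events]


def can_split_reads_and_writes(supported_events):
    """One counter for reads plus one for writes needs two usable slots."""
    return len(assignable_slots(supported_events)) >= 2
```

Under this model, a resource that only supports "total" leaves a single slot, so a read/write breakdown for the same group is not possible.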
MPAM does not support any filters that would differentiate between
traffic serviced by local or remote memory, so it's difficult to see
an MBM event other than "total" ever being used. Multiple MSCs
measuring memory bandwidth at an interconnect and a local memory
controller could potentially be used together to infer the "local"
and "total" counts, but this would require the implementation to
understand the platform-specific relationship between different types
of MSCs and somehow present them as a single rdt_resource to resctrl.
As best as I can tell, the MPAM driver today will choose "local" or
"total"[1] for what it will present to the FS layer as an
rdt_resource.
Based on this, I would prefer the arch/fs refactoring changes go in
first to give us more time to think about how better to abstract
counter assignment on a non-RDTlike implementation. I believe finally
settling on an arch/fs separation for the currently-supported feature
set would make the counter assignment work clearer for everyone
involved. Also, my own users have been using an implementation like
this one successfully for over a year on ARM-based platforms while I'm
still just experimenting with the usage model of ABMC on AMD hardware,
so I consider the MPAM work to be more mature and would not like to
see it delayed on account of ABMC.
Thanks!
-Peter
[1] https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers/platform/arm64/mpam/mpam_resctrl.c?h=mpam/snapshot/v6.14-rc1#n824
>
> Reinette
>
> [1] https://lore.kernel.org/lkml/9479c799-86fc-4d9e-addb-66011ecae9c7@arm.com/
> [2] https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
>
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-11 14:36 ` Peter Newman
@ 2025-02-11 18:37 ` James Morse
2025-02-12 15:24 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-11 18:37 UTC (permalink / raw)
To: Peter Newman, Reinette Chatre
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni
Hi Peter,
On 11/02/2025 14:36, Peter Newman wrote:
> On Mon, Feb 10, 2025 at 6:24 PM Reinette Chatre
> <reinette.chatre@intel.com> wrote:
>> I'd like to check in on what you said in [1]. It sounded as though you were
>> planning to look at the assignable counter work from an Arm/MPAM
>> perspective but that work has since progressed (now at V11 [2]) without
>> input from Arm/MPAM perspective. As I understand assignable counters may benefit
>> MPAM and looking close to settled but it is difficult to gain confidence
>> in an interface that may (may not?) be used for MPAM without any feedback
>> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
>> this new interface and find it confusing that there does not seem to be
>> any input from MPAM side. What am I missing?
>
> I've looked into monitor assignment on MPAM a little, so I'll share my findings.
>
> Like with ABMC/BMEC, MPAM's counters can be configured to monitor
> reads, writes, or both, so there are situations where it would be
> useful to be able to assign 2 counters to the same group to be able to
> break down the bandwidth between reads and writes. However, a group's
> two assignment slots are called "local" and "total", so if MPAM's
> resources only support one of the two, then only one counter can be
> assigned to a group.
Wouldn't this be a problem on AMD too?
... specifically 2 counters with different configurations to the same group ...
I suspect it may be simpler to support complex things like that via perf.
I'd dropped that in favour of ABMC, but one platform has come out of the woodwork where
there are only monitors on the L2 - and I don't think we should expose new counter files
via resctrl...
> MPAM does not support any filters that would differentiate between
> traffic serviced by local or remote memory, so it's difficult to see
> an MBM event other than "total" ever being used.
The driver guesses from the topology! If the counters used are on the L3, chances are they
are local to a NUMA node. If they're on the memory controller, it's probably total.
That code does need tightening up to check the cache boundaries match the NUMA boundaries
- but I haven't found a machine to test the bandwidth counters on at all yet.
I don't see how this would change what resctrl exposes - mbm_local and mbm_total already
exist. It's up to the MPAM driver to best match what it has with what it can expose to
user-space...
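
A rough sketch of the guess described above, for illustration only. It is not the MPAM driver's code: the function name `guess_mbm_event` and the location strings are assumptions, and returning no guess for other locations (such as L2) is a choice made for this example.

```python
# Hypothetical sketch of the topology guess; not taken from the MPAM driver.
def guess_mbm_event(monitor_location):
    """Map where the bandwidth monitors sit to the resctrl event they resemble."""
    if monitor_location == "L3":
        # Monitors on the L3 are probably local to a NUMA node.
        return "mbm_local"
    if monitor_location == "memory_controller":
        # Monitors on the memory controller probably see total traffic.
        return "mbm_total"
    # Assumption for this sketch: no guess is made for other locations.
    return None
```

The mapping only selects between the existing mbm_local and mbm_total events, so nothing new is exposed through resctrl.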
> Multiple MSCs
> measuring memory bandwidth at an interconnect and a local memory
> controller could potentially be used together to infer the "local"
> and "total" counts, but this would require the implementation to
> understand the platform-specific relationship between different types
> of MSCs and somehow present them as a single rdt_resource to resctrl.
> As best as I can tell, the MPAM driver today will choose "local" or
> "total"[1] for what it will present to the FS layer as an
> rdt_resource.
I think 'both' should fall out of that logic. It should keep moving the 'total' bandwidth
counter down the hierarchy until it reaches the memory controller.
I'd expect a platform that looks like this to have bandwidth monitors on the L3 (or
whatever cache matches the NUMA boundary) and bandwidth monitors on the memory controller.
Having two sets of bandwidth counters that measure different things in the same MSC is not
something that can be described by the firmware tables. (I did ask)
I think the logic here would be contained to the MPAM driver...
Thanks,
James
> Based on this, I would prefer the arch/fs refactoring changes go in
> first to give us more time to think about how better to abstract
> counter assignment on a non-RDTlike implementation. I believe finally
> settling on an arch/fs separation for the currently-supported feature
> set would make the counter assignment work clearer for everyone
> involved. Also, my own users have been using an implementation like
> this one successfully for over a year on ARM-based platforms while I'm
> still just experimenting with the usage model of ABMC on AMD hardware,
> so I consider the MPAM work to be more mature and would not like to
> see it delayed on account of ABMC.
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-11 14:36 ` Peter Newman
2025-02-11 18:37 ` James Morse
@ 2025-02-12 15:24 ` Moger, Babu
1 sibling, 0 replies; 135+ messages in thread
From: Moger, Babu @ 2025-02-12 15:24 UTC (permalink / raw)
To: Peter Newman, Reinette Chatre, James Morse
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Shanker Donthineni
Hi Peter/Reinette/James,
On 2/11/25 08:36, Peter Newman wrote:
> Hi Reinette,
>
> On Mon, Feb 10, 2025 at 6:24 PM Reinette Chatre
> <reinette.chatre@intel.com> wrote:
>>
>> Hi James,
>>
>> I'd like to check in on what you said in [1]. It sounded as though you were
>> planning to look at the assignable counter work from an Arm/MPAM
>> perspective but that work has since progressed (now at V11 [2]) without
>> input from Arm/MPAM perspective. As I understand assignable counters may benefit
>> MPAM and looking close to settled but it is difficult to gain confidence
>> in an interface that may (may not?) be used for MPAM without any feedback
>> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
>> this new interface and find it confusing that there does not seem to be
>> any input from MPAM side. What am I missing?
>
> I've looked into monitor assignment on MPAM a little, so I'll share my findings.
Thanks.
>
> Like with ABMC/BMEC, MPAM's counters can be configured to monitor
> reads, writes, or both, so there are situations where it would be
> useful to be able to assign 2 counters to the same group to be able to
> break down the bandwidth between reads and writes. However, a group's
> two assignment slots are called "local" and "total", so if MPAM's
> resources only support one of the two, then only one counter can be
> assigned to a group.
This can be done with the current ABMC interface. Only one counter can be
assigned to the group.
>
> MPAM does not support any filters that would differentiate between
> traffic serviced by local or remote memory, so it's difficult to see
> an MBM event other than "total" ever being used. Multiple MSCs
> measuring memory bandwidth at an interconnect and a local memory
> controller could potentially be used together to infer the "local"
> and "total" counts, but this would require the implementation to
> understand the platform-specific relationship between different types
> of MSCs and somehow present them as a single rdt_resource to resctrl.
> As best as I can tell, the MPAM driver today will choose "local" or
> "total"[1] for what it will present to the FS layer as an
> rdt_resource.
That is still fine.
>
> Based on this, I would prefer the arch/fs refactoring changes go in
> first to give us more time to think about how better to abstract
> counter assignment on a non-RDTlike implementation. I believe finally
I don't believe this series is ready for merging yet. It still needs to go
through the review process and a few more revisions. Based on our past
experience, the turnaround time from ARM has not been great, which will
likely delay this series by six to eight months.
> settling on an arch/fs separation for the currently-supported feature
> set would make the counter assignment work clearer for everyone
> involved. Also, my own users have been using an implementation like
> this one successfully for over a year on ARM-based platforms while I'm
> still just experimenting with the usage model of ABMC on AMD hardware,
> so I consider the MPAM work to be more mature and would not like to
> see it delayed on account of ABMC.
We've been working on ABMC for the past year, and it's almost ready for
merging. Now we have to wait? For how long?
At a high level, ABMC and assignment in MPAM look similar. We added
the assignment interface, assuming it would be easily adapted to MPAM.
We've incorporated all the requested changes, at least from Peter, but
haven't received much feedback from ARM.
James, could you take some time to review the interface and see if it can
be easily adapted for MPAM?
https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
I was planning to post the next version this week, but I can wait for
feedback.
>
> Thanks!
> -Peter
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers/platform/arm64/mpam/mpam_resctrl.c?h=mpam/snapshot/v6.14-rc1#n824
>
>>
>> Reinette
>>
>> [1] https://lore.kernel.org/lkml/9479c799-86fc-4d9e-addb-66011ecae9c7@arm.com/
>> [2] https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
>>
>
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-10 17:24 ` [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
2025-02-11 14:36 ` Peter Newman
@ 2025-02-11 18:37 ` James Morse
2025-02-11 19:29 ` Reinette Chatre
1 sibling, 1 reply; 135+ messages in thread
From: James Morse @ 2025-02-11 18:37 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi Reinette,
On 10/02/2025 17:24, Reinette Chatre wrote:
> I'd like to check in on what you said in [1]. It sounded as though you were
> planning to look at the assignable counter work from an Arm/MPAM
> perspective but that work has since progressed (now at V11 [2]) without
> input from Arm/MPAM perspective. As I understand assignable counters may benefit
> MPAM and looking close to settled but it is difficult to gain confidence
> in an interface that may (may not?) be used for MPAM without any feedback
> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
> this new interface and find it confusing that there does not seem to be
> any input from MPAM side. What am I missing?
Shortly after that some 'new' Spectre issue turned up - unfortunately those rapidly
consume all the time available, and predicting them is, er, the nature of the problem.
This is still on my todo list, I think Dave is planning to look through the ABMC series
too. I had previously sent some comments, (words to the effect "works for me"), and shared
a branch with the MPAM tree rebased on top.
I'm part way through rebasing the MPAM tree on top of Babu's latest version, and still hope
to give some feedback based on testing it...
Thanks,
James
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-11 18:37 ` James Morse
@ 2025-02-11 19:29 ` Reinette Chatre
2025-02-12 16:04 ` Dave Martin
0 siblings, 1 reply; 135+ messages in thread
From: Reinette Chatre @ 2025-02-11 19:29 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
Babu Moger, shameerali.kolothum.thodi, D Scott Phillips OS, carl,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Shanker Donthineni
Hi James,
On 2/11/25 10:37 AM, James Morse wrote:
> Hi Reinette,
>
> On 10/02/2025 17:24, Reinette Chatre wrote:
>> I'd like to check in on what you said in [1]. It sounded as though you were
>> planning to look at the assignable counter work from an Arm/MPAM
>> perspective but that work has since progressed (now at V11 [2]) without
>> input from Arm/MPAM perspective. As I understand assignable counters may benefit
>> MPAM and looking close to settled but it is difficult to gain confidence
>> in an interface that may (may not?) be used for MPAM without any feedback
>> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
>> this new interface and find it confusing that there does not seem to be
>> any input from MPAM side. What am I missing?
>
> Shortly after that some 'new' Spectre issue turned up - unfortunately those rapidly
> consume all the time available, and predicting them is, er, the nature of the problem.
I understand how these things go. At the same time the new version you sent hinted to
me that you were able to come up for air to give some attention to resctrl upstream and
I grasped the opportunity to get your input on this item that moved along while you were
busy elsewhere.
>
> This is still on my todo list, I think Dave is planning to look through the ABMC series
> too. I had previously sent some comments, (words to the effect "works for me"), and shared
> a branch with the MPAM tree rebased on top.
To me the primary concern is the new resctrl files introduced by this work. To get an idea
of the work and the new files you need only consider Babu's detailed cover letter. These files
have changed since your previous comments so a fresh "works for me" will be appreciated.
>
> I'm part way through rebasing the MPAM tree on top of Babu's latest version, and still hope
> to give some feedback based on testing it...
We tried to separate the arch and fs code logically but I am sure that it can be improved once
another architecture actually tries to use it. To me these internals seem a problem that
can be solved after merge of the assignable counter work, but addressing issues with the new
resctrl fs user interface after merge will be hard.
Reinette
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-11 19:29 ` Reinette Chatre
@ 2025-02-12 16:04 ` Dave Martin
0 siblings, 0 replies; 135+ messages in thread
From: Dave Martin @ 2025-02-12 16:04 UTC (permalink / raw)
To: Reinette Chatre
Cc: James Morse, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Koba Ko, Shanker Donthineni
Hi Reinette,
On Tue, Feb 11, 2025 at 11:29:18AM -0800, Reinette Chatre wrote:
> Hi James,
>
> On 2/11/25 10:37 AM, James Morse wrote:
> > Hi Reinette,
> >
> > On 10/02/2025 17:24, Reinette Chatre wrote:
[...]
> [...] the new version you sent hinted to
> me that you were able to come up for air to give some attention to resctrl upstream and
> I grasped the opportunity to get your input on this item that moved along while you were
> busy elsewhere.
>
> >
> > This is still on my todo list, I think Dave is planning to look through the ABMC series
> > too. I had previously sent some comments, (words to the effect "works for me"), and shared
> > a branch with the MPAM tree rebased on top.
>
> To me the primary concern is the new resctrl files introduced by this work. To get an idea
> of the work and the new files you need only consider Babu's detailed cover letter. These files
> have changed since your previous comments so a fresh "works for me" will be appreciated.
This is roughly what I was planning to try to do, so I will go ahead
with that.
[...]
Cheers
---Dave
^ permalink raw reply [flat|nested] 135+ messages in thread
* RE: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-07 18:17 [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (42 preceding siblings ...)
2025-02-10 17:24 ` [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
@ 2025-02-28 1:15 ` Shaopeng Tan (Fujitsu)
2025-02-28 19:55 ` James Morse
43 siblings, 1 reply; 135+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-02-28 1:15 UTC (permalink / raw)
To: 'James Morse', x86@kernel.org,
linux-kernel@vger.kernel.org
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS, carl@os.amperecomputing.com,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
Hello James
I ran the resctrl selftests on an Intel(R) Xeon(R) Gold 6338T CPU @ 2.10GHz and found no problems.
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Best regards,
Shaopeng TAN
^ permalink raw reply [flat|nested] 135+ messages in thread
* Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2025-02-28 1:15 ` Shaopeng Tan (Fujitsu)
@ 2025-02-28 19:55 ` James Morse
0 siblings, 0 replies; 135+ messages in thread
From: James Morse @ 2025-02-28 19:55 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), x86@kernel.org,
linux-kernel@vger.kernel.org
Cc: Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS, carl@os.amperecomputing.com,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Shanker Donthineni
Hello,
On 28/02/2025 01:15, Shaopeng Tan (Fujitsu) wrote:
> I ran resctrl selftest on Intel(R) Xeon(R) Gold 6338T CPU @ 2.10GHz, there is no problem.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Thank you for your testing work.
James
^ permalink raw reply [flat|nested] 135+ messages in thread