* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver
@ 2024-11-19 13:50 Zeng Heng
2024-11-19 13:50 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 1/6] arm_mpam: Introduce the definitions of intPARTID and reqPARTID Zeng Heng
` (6 more replies)
0 siblings, 7 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:50 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
The patch set is applied for mpam/snapshot/v6.12-rc1 branch of
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
repository.
This patch set is fully compatible with x86 RDT functionality.
The narrow-partid feature in MPAM allows for a more efficient use of
PARTIDs by enabling a many-to-one mapping of reqpartids (requested PARTIDs)
to intpartids (internal PARTIDs). This mapping reduces the number of unique
PARTIDs needed, thus allowing more tasks or processes to be monitored and
managed with the available resources.
Intpartid(Internal PARTID) is an internal identifier used by the hardware
to represent a specific resource partition. It is a low-level identifier
that the hardware uses to track and manage resource allocation and
monitoring.
Reqpartid(Request PARTID) is an identifier provided by the software when
requesting resources from the memory system. It indicates the desired
partition for resource monitoring. By using reqpartids, software can
monitor specific resources or allow the system to subdivide smaller
granularity partitions within existing partitions to serve as monitoring
partitions.
For the new rmid allocation strategy, it will check whether there is an
available rmid of any reqPARTID which belongs to the input intPARTID.
The MPAM driver statically assigns all reqPARTIDs to respective intPARTIDs,
with a specific illustration as follows:
m - Indicates the number of reqPARTIDs per intPARTID
n - Indicates the total number of intPARTIDs
(m * n) - Represents the total number of reqPARTIDs
intPARTID_1 = 0
├── reqPARTID_1_1 = 0
├── reqPARTID_1_2 = 0 + n
├── ...
└── reqPARTID_1_m = 0 + n * (m - 1)
intPARTID_2 = 1
├── reqPARTID_2_1 = 1
├── reqPARTID_2_2 = 1 + n
├── ...
└── reqPARTID_2_m = 1 + n * (m - 1)
...
intPARTID_n = (n - 1)
Each intPARTID has m reqPARTIDs, which are used to expand the number of
monitoring groups under the control group. Therefore, the number of
monitoring groups is no longer limited by the range of MPAM PMG, which
enhances the extensibility of the system's monitoring capabilities.
---
compared with v1:
- Rebase this patch set on latest MPAM driver of the v6.12-rc1 branch.
---
Dave Martin (1):
arm_mpam: Set INTERNAL as needed when setting MSC controls
Zeng Heng (5):
arm_mpam: Introduce the definitions of intPARTID and reqPARTID
arm_mpam: Create reqPARTIDs resource bitmap
arm_mpam: Enhance the rmid allocation strategy
arm_mpam: Call resctrl_sync_config() when allocate new reqPARTID
fs/resctrl: Add the helper to check if the task exists in the target
group
arch/x86/kernel/cpu/resctrl/core.c | 20 +++
drivers/platform/arm64/mpam/mpam_devices.c | 80 +++++++++--
drivers/platform/arm64/mpam/mpam_internal.h | 6 +
drivers/platform/arm64/mpam/mpam_resctrl.c | 145 +++++++++++++++++++-
fs/resctrl/internal.h | 4 -
fs/resctrl/monitor.c | 16 ++-
fs/resctrl/pseudo_lock.c | 7 +-
fs/resctrl/rdtgroup.c | 84 ++++++++----
include/linux/resctrl.h | 30 ++++
9 files changed, 342 insertions(+), 50 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 1/6] arm_mpam: Introduce the definitions of intPARTID and reqPARTID
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
@ 2024-11-19 13:50 ` Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 2/6] arm_mpam: Create reqPARTIDs resource bitmap Zeng Heng
` (5 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:50 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
The narrow-partid feature in MPAM allows for a more efficient use of
PARTIDs by enabling a many-to-one mapping of reqpartids (requested PARTIDs)
to intpartids (internal PARTIDs). This mapping reduces the number of unique
PARTIDs needed, thus allowing more tasks or processes to be monitored and
managed with the available resources.
Intpartid(Internal PARTID) is an internal identifier used by the hardware
to represent a specific resource partition. It is a low-level identifier
that the hardware uses to track and manage resource allocation and
monitoring.
Reqpartid(Request PARTID) is an identifier provided by the software when
requesting resources from the memory system. It indicates the desired
partition for resource monitoring. By using reqpartids, software can
monitor specific resources or allow the system to subdivide smaller
granularity partitions within existing partitions to serve as monitoring
partitions.
Regarding intPARTID, MPAM uses it as the unit for control group
configuration delivery. MPAM will synchronize the delivered configuration
to all reqPARTIDs mapped to the same intPARTIDs. The number of intPARTIDs
is indicated by MPAMF_PARTID_NRW_IDR.INTPARTID_MAX if implemented, or
directly use the number of PARTID as intpartid_max.
reqPARTIDs can be used to expand the number of monitors, for each control
group is no longer simply restricted by the range of PMG. By mapping
between intPARTID and reqPARTID, the number of monitors would be greatly
expanded and more fine-grained monitoring under a control group will be
achieved.
After initialization, MPAM driver would select the minimum intpartid_max
among all classes of MSCs as the number of CLOSIDs, and use partid_max as
the number of reqPARTIDs.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 12 +++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 2 ++
drivers/platform/arm64/mpam/mpam_resctrl.c | 9 +++++++--
3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 9463045c0011..ca621bb132e9 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -67,6 +67,7 @@ static DEFINE_MUTEX(mpam_cpuhp_state_lock);
* Generating traffic outside this range will result in screaming interrupts.
*/
u16 mpam_partid_max;
+u16 mpam_intpartid_max;
u8 mpam_pmg_max;
static bool partid_max_init, partid_max_published;
static DEFINE_SPINLOCK(partid_max_lock);
@@ -222,10 +223,16 @@ int mpam_register_requestor(u16 partid_max, u8 pmg_max)
spin_lock(&partid_max_lock);
if (!partid_max_init) {
mpam_partid_max = partid_max;
+ /*
+ * Update mpam_intpartid_max here, in case the
+ * system doesn't have narrow-partid feature.
+ */
+ mpam_intpartid_max = partid_max;
mpam_pmg_max = pmg_max;
partid_max_init = true;
} else if (!partid_max_published) {
mpam_partid_max = min(mpam_partid_max, partid_max);
+ mpam_intpartid_max = min(mpam_intpartid_max, partid_max);
mpam_pmg_max = min(mpam_pmg_max, pmg_max);
} else {
/* New requestors can't lower the values */
@@ -984,7 +991,9 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
u16 partid_max = FIELD_GET(MPAMF_PARTID_NRW_IDR_INTPARTID_MAX, nrwidr);
mpam_set_feature(mpam_feat_partid_nrw, props);
- msc->partid_max = min(msc->partid_max, partid_max);
+ msc->intpartid_max = min(msc->partid_max, partid_max);
+ } else {
+ msc->intpartid_max = msc->partid_max;
}
mpam_mon_sel_outer_unlock(msc);
@@ -1046,6 +1055,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
spin_lock(&partid_max_lock);
mpam_partid_max = min(mpam_partid_max, msc->partid_max);
+ mpam_intpartid_max = min(mpam_intpartid_max, msc->intpartid_max);
mpam_pmg_max = min(mpam_pmg_max, msc->pmg_max);
spin_unlock(&partid_max_lock);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 5af6ed60272e..5fc9f09b6945 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -86,6 +86,7 @@ struct mpam_msc
bool error_irq_requested;
bool error_irq_hw_enabled;
u16 partid_max;
+ u16 intpartid_max;
u8 pmg_max;
unsigned long ris_idxs[128 / BITS_PER_LONG];
u32 ris_max;
@@ -466,6 +467,7 @@ static inline void mpam_assert_srcu_read_lock_held(void)
/* System wide partid/pmg values */
extern u16 mpam_partid_max;
+extern u16 mpam_intpartid_max;
extern u8 mpam_pmg_max;
/* Scheduled work callback to enable mpam once all MSC have been probed */
diff --git a/drivers/platform/arm64/mpam/mpam_resctrl.c b/drivers/platform/arm64/mpam/mpam_resctrl.c
index 76ddbce0ea9c..ac3d228befcf 100644
--- a/drivers/platform/arm64/mpam/mpam_resctrl.c
+++ b/drivers/platform/arm64/mpam/mpam_resctrl.c
@@ -162,6 +162,11 @@ static bool mpam_resctrl_hide_cdp(enum resctrl_res_level rid)
* only the system wide safe value is safe to use.
*/
u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
+{
+ return mpam_intpartid_max + 1;
+}
+
+static u32 get_num_reqpartid(void)
{
return mpam_partid_max + 1;
}
@@ -169,9 +174,9 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
u32 resctrl_arch_system_num_rmid_idx(void)
{
u8 closid_shift = fls(mpam_pmg_max);
- u32 num_partid = resctrl_arch_get_num_closid(NULL);
+ u32 num_reqpartid = get_num_reqpartid();
- return num_partid << closid_shift;
+ return num_reqpartid << closid_shift;
}
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 2/6] arm_mpam: Create reqPARTIDs resource bitmap
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
2024-11-19 13:50 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 1/6] arm_mpam: Introduce the definitions of intPARTID and reqPARTID Zeng Heng
@ 2024-11-19 13:51 ` Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 3/6] arm_mpam: Enhance the rmid allocation strategy Zeng Heng
` (4 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:51 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
The driver checks whether mpam_partid_max and mpam_intpartid_max are equal
as a basis for supporting reqPARTID. If this feature is supported, use a
bitmap to represent whether the target reqPARTID is available or not.
Create the bitmap during monitor initialization, and the destructor is
called during the monitor exit process.
It is noted that the reqpartid_free_map reserves the first reqPARTID under
each intPARTID (which is equal to the corresponding intPARTID itself). By
default, assigns it to the corresponding control group for use in
monitoring.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/platform/arm64/mpam/mpam_resctrl.c | 33 ++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/platform/arm64/mpam/mpam_resctrl.c b/drivers/platform/arm64/mpam/mpam_resctrl.c
index ac3d228befcf..dfd4125875d7 100644
--- a/drivers/platform/arm64/mpam/mpam_resctrl.c
+++ b/drivers/platform/arm64/mpam/mpam_resctrl.c
@@ -1005,6 +1005,33 @@ static void mpam_resctrl_monitor_init(struct mpam_class *class,
return;
}
+static unsigned long *reqpartid_free_map;
+static int reqpartid_free_map_len;
+
+static int reqpartid_create(void)
+{
+ u32 reqpartid_num = get_num_reqpartid();
+ int i;
+
+ reqpartid_free_map = bitmap_alloc(reqpartid_num, GFP_KERNEL);
+ if (!reqpartid_free_map)
+ return -ENOMEM;
+
+ bitmap_fill(reqpartid_free_map, reqpartid_num);
+
+ /* Reserved for the internal partIDs mapping */
+ for (i = 0; i < resctrl_arch_get_num_closid(NULL); i++)
+ __clear_bit(i, reqpartid_free_map);
+
+ reqpartid_free_map_len = reqpartid_num;
+ return 0;
+}
+
+static void reqpartid_destroy(void)
+{
+ bitmap_free(reqpartid_free_map);
+}
+
int mpam_resctrl_setup(void)
{
int err = 0;
@@ -1052,6 +1079,10 @@ int mpam_resctrl_setup(void)
cpus_read_unlock();
+ err = reqpartid_create();
+ if (err)
+ return err;
+
if (!err && !exposed_alloc_capable && !exposed_mon_capable)
err = -EOPNOTSUPP;
@@ -1080,6 +1111,8 @@ static void mpam_resctrl_exit(void)
WRITE_ONCE(resctrl_enabled, false);
resctrl_exit();
+
+ reqpartid_destroy();
}
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 3/6] arm_mpam: Enhance the rmid allocation strategy
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
2024-11-19 13:50 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 1/6] arm_mpam: Introduce the definitions of intPARTID and reqPARTID Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 2/6] arm_mpam: Create reqPARTIDs resource bitmap Zeng Heng
@ 2024-11-19 13:51 ` Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 4/6] arm_mpam: Set INTERNAL as needed when setting MSC controls Zeng Heng
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:51 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
resctrl_arch_alloc_rmid() provides the similar functionality as
alloc_rmid(), but it is an enhanced version based on different architecture
implementations.
For the new rmid allocation strategy, it will check whether there is an
available rmid of any reqPARTID which belongs to the input intPARTID.
The MPAM driver statically assigns all reqPARTIDs to respective intPARTIDs,
with a specific illustration as follows:
m - Indicates the number of reqPARTIDs per intPARTID
n - Indicates the total number of intPARTIDs
(m * n) - Represents the total number of reqPARTIDs
intPARTID_1 = 0
├── reqPARTID_1_1 = 0
├── reqPARTID_1_2 = 0 + n
├── ...
└── reqPARTID_1_m = 0 + n * (m - 1)
intPARTID_2 = 1
├── reqPARTID_2_1 = 1
├── reqPARTID_2_2 = 1 + n
├── ...
└── reqPARTID_2_m = 1 + n * (m - 1)
...
intPARTID_n = (n - 1)
Each intPARTID has m reqPARTIDs, which are used to expand the number of
monitoring groups under the control group. Therefore, the number of
monitoring groups is no longer limited by the range of MPAM PMG, which
enhances the extensibility of the system's monitoring capabilities.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 15 ++++++++
drivers/platform/arm64/mpam/mpam_resctrl.c | 41 ++++++++++++++++++++--
fs/resctrl/internal.h | 2 --
fs/resctrl/pseudo_lock.c | 3 +-
fs/resctrl/rdtgroup.c | 3 +-
include/linux/resctrl.h | 16 +++++++++
6 files changed, 71 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 816d9af6b36b..f9195f1bece0 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -697,6 +697,21 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
domain_remove_cpu_mon(cpu, r);
}
+int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid)
+{
+ int ret;
+
+ if (!closid || !rmid)
+ return -EINVAL;
+
+ ret = alloc_rmid(*closid);
+ if (ret < 0)
+ return ret;
+
+ *rmid = ret;
+ return 0;
+}
+
static void clear_closid_rmid(int cpu)
{
struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
diff --git a/drivers/platform/arm64/mpam/mpam_resctrl.c b/drivers/platform/arm64/mpam/mpam_resctrl.c
index dfd4125875d7..f0fb76b2424a 100644
--- a/drivers/platform/arm64/mpam/mpam_resctrl.c
+++ b/drivers/platform/arm64/mpam/mpam_resctrl.c
@@ -171,6 +171,11 @@ static u32 get_num_reqpartid(void)
return mpam_partid_max + 1;
}
+static u32 get_num_reqpartid_per_closid(void)
+{
+ return get_num_reqpartid() / resctrl_arch_get_num_closid(NULL);
+}
+
u32 resctrl_arch_system_num_rmid_idx(void)
{
u8 closid_shift = fls(mpam_pmg_max);
@@ -182,10 +187,11 @@ u32 resctrl_arch_system_num_rmid_idx(void)
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
{
u8 closid_shift = fls(mpam_pmg_max);
+ u32 pmg_mask = ~(~0 << closid_shift);
BUG_ON(closid_shift > 8);
- return (closid << closid_shift) | rmid;
+ return (closid << closid_shift) | (rmid & pmg_mask);
}
void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
@@ -195,8 +201,10 @@ void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
BUG_ON(closid_shift > 8);
- *closid = idx >> closid_shift;
- *rmid = idx & pmg_mask;
+ if (closid)
+ *closid = idx >> closid_shift;
+ if (rmid)
+ *rmid = idx & pmg_mask;
}
void resctrl_arch_sched_in(struct task_struct *tsk)
@@ -1032,6 +1040,33 @@ static void reqpartid_destroy(void)
bitmap_free(reqpartid_free_map);
}
+int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid)
+{
+ int closid_num = resctrl_arch_get_num_closid(NULL);
+ int i, reqpartid, pmg;
+
+ if (!closid || !rmid)
+ return -EINVAL;
+
+ /* The closid is out of the range of intPARTIDs */
+ if (*closid >= closid_num)
+ return -EINVAL;
+
+ for (i = 0; i < get_num_reqpartid_per_closid(); i++) {
+ reqpartid = i * closid_num + *closid;
+ pmg = alloc_rmid(reqpartid);
+ if (pmg >= 0)
+ break;
+ }
+
+ if (pmg < 0)
+ return pmg;
+
+ *closid = reqpartid;
+ *rmid = pmg;
+ return 0;
+}
+
int mpam_resctrl_setup(void)
{
int err = 0;
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index a4288997fb93..f817e67a8cfc 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -325,8 +325,6 @@ int closids_supported(void);
void closid_free(int closid);
-int alloc_rmid(u32 closid);
-
void free_rmid(u32 closid, u32 rmid);
void resctrl_mon_resource_exit(void);
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
index 65f3ae15cdfa..ab8d7a36d303 100644
--- a/fs/resctrl/pseudo_lock.c
+++ b/fs/resctrl/pseudo_lock.c
@@ -582,12 +582,11 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
int ret;
if (resctrl_arch_mon_capable()) {
- ret = alloc_rmid(rdtgrp->closid);
+ ret = resctrl_arch_alloc_rmid(&rdtgrp->closid, &rdtgrp->mon.rmid);
if (ret < 0) {
rdt_last_cmd_puts("Out of RMIDs\n");
return ret;
}
- rdtgrp->mon.rmid = ret;
}
ret = rdtgroup_locksetup_user_restore(rdtgrp);
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 30f1dd05e10b..7a71fc0b76b5 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -3272,12 +3272,11 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
if (!resctrl_arch_mon_capable())
return 0;
- ret = alloc_rmid(rdtgrp->closid);
+ ret = resctrl_arch_alloc_rmid(&rdtgrp->closid, &rdtgrp->mon.rmid);
if (ret < 0) {
rdt_last_cmd_puts("Out of RMIDs\n");
return ret;
}
- rdtgrp->mon.rmid = ret;
ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
if (ret) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index be9d9f4c7b81..4fffbe8cd22e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -506,6 +506,22 @@ static inline void resctrl_arch_rmid_read_context_check(void)
might_sleep();
}
+int alloc_rmid(u32 closid);
+/**
+ * resctrl_arch_alloc_rmid() - Providing the similar functionality as
+ * alloc_rmid, but this function is an
+ * enhanced version based on different
+ * architecture implementations.
+ * @closid: closid that matches the rmid. Depending on the
+ * architecture, may update new closid and return by the
+ * pointer.
+ * @rmid: update available rmid by the pointer.
+ *
+ * Return:
+ * 0 on success, or -EINVAL, -ENOSPC etc on error.
+ */
+int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid);
+
/**
* resctrl_arch_reset_rmid() - Reset any private state associated with rmid
* and eventid.
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 4/6] arm_mpam: Set INTERNAL as needed when setting MSC controls
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
` (2 preceding siblings ...)
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 3/6] arm_mpam: Enhance the rmid allocation strategy Zeng Heng
@ 2024-11-19 13:51 ` Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 5/6] arm_mpam: Call resctrl_sync_config() when allocate new reqPARTID Zeng Heng
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:51 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
From: Dave Martin <Dave.Martin@arm.com>
Currently, when an MSC implements PARTID narrowing, MPAMCFG_PART_SEL is
left set to the reqPARTID while while programming resource controls in
an MSC.
The MPAM architecture does not guarantee that any particular resource
controls will be updated correctly in this scenario. Instead,
MPAMCFG_PART_SEL must be written with the corresponding intPARTID and
with the INTERNAL bit set before attempting to program resource controls.
Only the PARTID->intPARTID mappings can be written without the INTERNAL
bit set in MPAMCFG_PART_SEL.
Fix it, by rewriting MPAMCFG_PART_SEL appropriately after setting the
intPARTID mapping.
The MPAMCFG_INTPARTID_INTERNAL flag is not currently accepted as
input to the __mpam_part_sel() helper. In the interest of keeping
the code clean, break this helper up.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 26 +++++++++++++++++-----
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index ca621bb132e9..781c9146718d 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -205,15 +205,27 @@ static u64 mpam_msc_read_esr(struct mpam_msc *msc)
return (esr_high << 32) | esr_low;
}
+static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc)
+{
+ lockdep_assert_held(&msc->part_sel_lock);
+ mpam_write_partsel_reg(msc, PART_SEL, partsel);
+}
+
static void __mpam_part_sel(u8 ris_idx, u16 partid, struct mpam_msc *msc)
{
- u32 partsel;
+ u32 partsel = FIELD_PREP(MPAMCFG_PART_SEL_RIS, ris_idx) |
+ FIELD_PREP(MPAMCFG_PART_SEL_PARTID_SEL, partid);
- lockdep_assert_held(&msc->part_sel_lock);
+ __mpam_part_sel_raw(partsel, msc);
+}
- partsel = FIELD_PREP(MPAMCFG_PART_SEL_RIS, ris_idx) |
- FIELD_PREP(MPAMCFG_PART_SEL_PARTID_SEL, partid);
- mpam_write_partsel_reg(msc, PART_SEL, partsel);
+static void __mpam_intpart_sel(u8 ris_idx, u16 intpartid, struct mpam_msc *msc)
+{
+ u32 partsel = FIELD_PREP(MPAMCFG_PART_SEL_RIS, ris_idx) |
+ FIELD_PREP(MPAMCFG_PART_SEL_PARTID_SEL, intpartid) |
+ MPAMCFG_PART_SEL_INTERNAL;
+
+ __mpam_part_sel_raw(partsel, msc);
}
int mpam_register_requestor(u16 partid_max, u8 pmg_max)
@@ -1540,9 +1552,11 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
mutex_lock(&msc->part_sel_lock);
__mpam_part_sel(ris->ris_idx, partid, msc);
- if (mpam_has_feature(mpam_feat_partid_nrw, rprops))
+ if (mpam_has_feature(mpam_feat_partid_nrw, rprops)) {
mpam_write_partsel_reg(msc, INTPARTID,
MPAMCFG_INTPARTID_INTERNAL | partid);
+ __mpam_intpart_sel(ris->ris_idx, partid, msc);
+ }
if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
if (mpam_has_feature(mpam_feat_cpor_part, cfg))
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 5/6] arm_mpam: Call resctrl_sync_config() when allocate new reqPARTID
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
` (3 preceding siblings ...)
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 4/6] arm_mpam: Set INTERNAL as needed when setting MSC controls Zeng Heng
@ 2024-11-19 13:51 ` Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 6/6] fs/resctrl: Add the helper to check if the task exists in the target group Zeng Heng
2024-11-19 15:31 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Dave Martin
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:51 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
Add resctrl_sync_config() for establishing new mapping between intPARTID
and reqPARTID for hardware or synchronizing their configuration by
software.
For the expansion of monitors using reqPARTID in the system:
1. If a MSC class supports the narrow-partid feature, then it is only
necessary to associate the newly allocated reqPARTID with its
corresponding intPARTID.
2. If a MSC class does not support the narrow-partid feature, the
hardware can't associate reqPARTID with intPARTID, and the software
needs to synchronize the configuration of the parent control group to
the new PARTID (the newly allocated "reqpartid").
Additionally, for MSC class that doesn't support the narrow-partid feature,
when the control group configuration is updated, the software needs to
forcibly update its sub-monitoring groups (referencing the
__write_config()). We establish a reqPARTID resource bitmap and check
whether the reqPARTID is in use by reqpartid_is_busy() to reduce
unnecessary I/O operations.
In mpam_apply_config(), if closid is equal to intPARTID, we regard that
it is updating new configuration for the control group. Otherwise, it is
attempting to establish mapping and synchronize configuration for
sub-monitoring groups.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 5 ++
drivers/platform/arm64/mpam/mpam_devices.c | 44 +++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 4 ++
drivers/platform/arm64/mpam/mpam_resctrl.c | 66 ++++++++++++++++++++-
fs/resctrl/internal.h | 2 -
fs/resctrl/monitor.c | 16 ++++-
fs/resctrl/pseudo_lock.c | 4 +-
fs/resctrl/rdtgroup.c | 29 +++++++--
include/linux/resctrl.h | 14 +++++
9 files changed, 168 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index f9195f1bece0..beafe793674f 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -712,6 +712,11 @@ int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid)
return 0;
}
+void resctrl_arch_free_rmid(u32 closid, u32 rmid)
+{
+ free_rmid(closid, rmid);
+}
+
static void clear_closid_rmid(int cpu)
{
struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 781c9146718d..fcb98af447a8 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -1544,6 +1544,7 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
u32 pri_val = 0;
u16 cmax = MPAMCFG_CMAX_CMAX;
u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
+ u16 intpartid = req2intpartid(partid);
struct mpam_msc *msc = ris->vmsc->msc;
struct mpam_props *rprops = &ris->props;
u16 dspri = GENMASK(rprops->dspri_wd, 0);
@@ -1554,8 +1555,14 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
if (mpam_has_feature(mpam_feat_partid_nrw, rprops)) {
mpam_write_partsel_reg(msc, INTPARTID,
- MPAMCFG_INTPARTID_INTERNAL | partid);
- __mpam_intpart_sel(ris->ris_idx, partid, msc);
+ MPAMCFG_INTPARTID_INTERNAL |
+ intpartid);
+
+ /* Already finish mapping reqPARTID to intPARTID */
+ if (partid != intpartid)
+ goto out;
+
+ __mpam_intpart_sel(ris->ris_idx, intpartid, msc);
}
if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
@@ -1615,6 +1622,7 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
mpam_quirk_post_config_change(ris, partid, cfg);
+out:
mutex_unlock(&msc->part_sel_lock);
}
@@ -3072,10 +3080,34 @@ struct mpam_write_config_arg {
static int __write_config(void *arg)
{
+ int closid_num = resctrl_arch_get_num_closid(NULL);
struct mpam_write_config_arg *c = arg;
+ struct mpam_props *rprops;
+ u32 reqpartid;
+ int i;
mpam_reprogram_ris_partid(c->ris, c->partid, &c->comp->cfg[c->partid]);
+ if (c->partid != req2intpartid(c->partid))
+ return 0;
+
+ rprops = &c->ris->props;
+
+ /*
+ * Synchronize the configuration to each working sub-monitoring
+ * group.
+ */
+ for (i = 1; i < get_num_reqpartid_per_closid(); i++) {
+ reqpartid = i * closid_num + c->partid;
+
+ if (!reqpartid_is_busy(reqpartid))
+ continue;
+
+ if (!mpam_has_feature(mpam_feat_partid_nrw, rprops))
+ mpam_reprogram_ris_partid(c->ris, reqpartid,
+ &c->comp->cfg[c->partid]);
+ }
+
return 0;
}
@@ -3149,6 +3181,7 @@ static void mpam_extend_config(struct mpam_class *class, struct mpam_config *cfg
int mpam_apply_config(struct mpam_component *comp, u16 partid,
struct mpam_config *cfg)
{
+ int intpartid = req2intpartid(partid);
struct mpam_write_config_arg arg;
struct mpam_msc_ris *ris;
struct mpam_vmsc *vmsc;
@@ -3159,7 +3192,12 @@ int mpam_apply_config(struct mpam_component *comp, u16 partid,
mpam_extend_config(comp->class, cfg);
- if (!mpam_update_config(&comp->cfg[partid], cfg))
+ /*
+ * If the partid is not equal to its intpartid, then we continue
+ * to establish the association between reqpartid and intpartid.
+ */
+ if ((intpartid == partid) &&
+ !mpam_update_config(&comp->cfg[intpartid], cfg))
return 0;
arg.comp = comp;
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 5fc9f09b6945..7b5aa6406946 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -773,4 +773,8 @@ static inline void mpam_resctrl_teardown_class(struct mpam_class *class) { }
*/
#define MSMON_CAPT_EVNT_NOW BIT(0)
+u32 get_num_reqpartid_per_closid(void);
+u32 req2intpartid(u32 reqpartid);
+bool reqpartid_is_busy(int reqpartid);
+
#endif /* MPAM_INTERNAL_H */
diff --git a/drivers/platform/arm64/mpam/mpam_resctrl.c b/drivers/platform/arm64/mpam/mpam_resctrl.c
index f0fb76b2424a..04478857feae 100644
--- a/drivers/platform/arm64/mpam/mpam_resctrl.c
+++ b/drivers/platform/arm64/mpam/mpam_resctrl.c
@@ -171,11 +171,16 @@ static u32 get_num_reqpartid(void)
return mpam_partid_max + 1;
}
-static u32 get_num_reqpartid_per_closid(void)
+u32 get_num_reqpartid_per_closid(void)
{
return get_num_reqpartid() / resctrl_arch_get_num_closid(NULL);
}
+u32 req2intpartid(u32 reqpartid)
+{
+ return reqpartid % resctrl_arch_get_num_closid(NULL);
+}
+
u32 resctrl_arch_system_num_rmid_idx(void)
{
u8 closid_shift = fls(mpam_pmg_max);
@@ -1040,10 +1045,26 @@ static void reqpartid_destroy(void)
bitmap_free(reqpartid_free_map);
}
+static int reqpartid_alloc(int reqpartid)
+{
+ __clear_bit(reqpartid, reqpartid_free_map);
+ return reqpartid;
+}
+
+static void reqpartid_free(int reqpartid)
+{
+ __set_bit(reqpartid, reqpartid_free_map);
+}
+
+bool reqpartid_is_busy(int reqpartid)
+{
+ return !test_bit(reqpartid, reqpartid_free_map);
+}
+
int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid)
{
int closid_num = resctrl_arch_get_num_closid(NULL);
- int i, reqpartid, pmg;
+ int i, ret, reqpartid, pmg;
if (!closid || !rmid)
return -EINVAL;
@@ -1064,7 +1085,48 @@ int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid)
*closid = reqpartid;
*rmid = pmg;
+
+ if (reqpartid_is_busy(reqpartid))
+ return 0;
+
+ ret = reqpartid_alloc(reqpartid);
+ if (ret < 0)
+ goto out_free_rmid;
+
+ ret = resctrl_sync_config(reqpartid);
+ if (ret < 0)
+ goto out;
+
return 0;
+
+out:
+ reqpartid_free(reqpartid);
+out_free_rmid:
+ free_rmid(reqpartid, pmg);
+ return ret;
+}
+
+void resctrl_arch_free_rmid(u32 reqpartid, u32 pmg)
+{
+ int i;
+
+ free_rmid(reqpartid, pmg);
+
+ /* Always reserve the reqPARTID which equals to its intPARTID. */
+ if (reqpartid == req2intpartid(reqpartid))
+ return;
+
+ /*
+ * Check whether every RMID belongs to this reqPARTID is freed or
+ * not. If all belonging rmids are freed, also free the reqPARTID
+ * self.
+ */
+ for (i = 0; i <= mpam_pmg_max; i++) {
+ if (rmid_is_busy(reqpartid, i))
+ return;
+ }
+
+ reqpartid_free(reqpartid);
}
int mpam_resctrl_setup(void)
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index f817e67a8cfc..74edea42e44f 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -325,8 +325,6 @@ int closids_supported(void);
void closid_free(int closid);
-void free_rmid(u32 closid, u32 rmid);
-
void resctrl_mon_resource_exit(void);
void mon_event_count(void *info);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index d9041d134d54..8258ceaddd24 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -282,10 +282,24 @@ int alloc_rmid(u32 closid)
if (IS_ERR(entry))
return PTR_ERR(entry);
- list_del(&entry->list);
+ list_del_init(&entry->list);
return entry->rmid;
}
+bool rmid_is_busy(u32 closid, u32 rmid)
+{
+ u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+ struct rmid_entry *entry;
+
+ entry = __rmid_entry(idx);
+
+ /*
+ * If it's not in the rmid_free_lru list,
+ * the rmid is working.
+ */
+ return list_empty(&entry->list);
+}
+
static void add_rmid_to_limbo(struct rmid_entry *entry)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
index ab8d7a36d303..47951ff8d8d1 100644
--- a/fs/resctrl/pseudo_lock.c
+++ b/fs/resctrl/pseudo_lock.c
@@ -557,7 +557,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* anymore when this group would be used for pseudo-locking. This
* is safe to call on platforms not capable of monitoring.
*/
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
ret = 0;
goto out;
@@ -591,7 +591,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
ret = rdtgroup_locksetup_user_restore(rdtgrp);
if (ret) {
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
return ret;
}
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 7a71fc0b76b5..7ebf4bf75c94 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2746,7 +2746,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
head = &rdtgrp->mon.crdtgrp_list;
list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
- free_rmid(sentry->closid, sentry->mon.rmid);
+ resctrl_arch_free_rmid(sentry->closid, sentry->mon.rmid);
list_del(&sentry->mon.crdtgrp_list);
if (atomic_read(&sentry->waitcount) != 0)
@@ -2786,7 +2786,7 @@ static void rmdir_all_sub(void)
cpumask_or(&rdtgroup_default.cpu_mask,
&rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
kernfs_remove(rdtgrp->kn);
list_del(&rdtgrp->rdtgroup_list);
@@ -3281,7 +3281,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
if (ret) {
rdt_last_cmd_puts("kernfs subdir error\n");
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
return ret;
}
@@ -3291,7 +3291,24 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
{
if (resctrl_arch_mon_capable())
- free_rmid(rgrp->closid, rgrp->mon.rmid);
+ resctrl_arch_free_rmid(rgrp->closid, rgrp->mon.rmid);
+}
+
+int resctrl_sync_config(u32 closid)
+{
+ struct resctrl_schema *s;
+ int ret;
+
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ ret = resctrl_arch_update_domains(s->res, closid);
+ if (ret < 0) {
+ rdt_last_cmd_puts("Failed to synchronize partitions\n");
+ goto out;
+ }
+ }
+
+out:
+ return ret;
}
static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
@@ -3557,7 +3574,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
update_closid_rmid(tmpmask, NULL);
rdtgrp->flags = RDT_DELETED;
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
/*
* Remove the rdtgrp from the parent ctrl_mon group's list
@@ -3604,7 +3621,7 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
update_closid_rmid(tmpmask, NULL);
- free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+ resctrl_arch_free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
closid_free(rdtgrp->closid);
rdtgroup_ctrl_remove(rdtgrp);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4fffbe8cd22e..8f7ddbeab68f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -507,6 +507,9 @@ static inline void resctrl_arch_rmid_read_context_check(void)
}
int alloc_rmid(u32 closid);
+void free_rmid(u32 closid, u32 rmid);
+bool rmid_is_busy(u32 closid, u32 rmid);
+
/**
* resctrl_arch_alloc_rmid() - Providing the similar functionality as
* alloc_rmid, but this function is an
@@ -521,6 +524,17 @@ int alloc_rmid(u32 closid);
* 0 on success, or -EINVAL, -ENOSPC etc on error.
*/
int resctrl_arch_alloc_rmid(u32 *closid, u32 *rmid);
+/**
+ * resctrl_arch_free_rmid() - Manage rmid and reqpartid resources in
+ * conjunction with the
+ * resctrl_arch_alloc_rmid().
+ *
+ * @closid: closid that matches the rmid.
+ * @rmid: The rmid would be free.
+ */
+void resctrl_arch_free_rmid(u32 closid, u32 rmid);
+
+int resctrl_sync_config(u32 closid);
/**
* resctrl_arch_reset_rmid() - Reset any private state associated with rmid
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 6/6] fs/resctrl: Add the helper to check if the task exists in the target group
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
` (4 preceding siblings ...)
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 5/6] arm_mpam: Call resctrl_sync_config() when allocate new reqPARTID Zeng Heng
@ 2024-11-19 13:51 ` Zeng Heng
2024-11-19 15:31 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Dave Martin
6 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-11-19 13:51 UTC (permalink / raw)
To: james.morse, Dave.Martin
Cc: linux-kernel, jonathan.cameron, xiexiuqi, linux-arm-kernel
After introducing the extension for monitoring feature, to check if a
closid exists within the target control group, it is not only check if the
closid of the control group is the same, but also to recursively check if
there is a closid of sub-monitor groups is the same. Therefore, a new
helper task_belongs_to_ctrl_group() is added.
On the x86 side, the closid of the child monitor group is the same as its
parent control group's (only the rmid is different), hence
resctrl_arch_match_rmid() can directly use the closid of the child monitor
group. Meanwhile, task_belongs_to_ctrl_group() can replace
resctrl_arch_match_closid() to be compatible with the x86 arch.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
fs/resctrl/rdtgroup.c | 52 +++++++++++++++++++++++++++----------------
1 file changed, 33 insertions(+), 19 deletions(-)
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 7ebf4bf75c94..7a188f3cceeb 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -589,19 +589,38 @@ static void update_task_closid_rmid(struct task_struct *t)
static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdtgrp)
{
- u32 closid, rmid = rdtgrp->mon.rmid;
+ u32 closid, rmid;
- if (rdtgrp->type == RDTCTRL_GROUP)
- closid = rdtgrp->closid;
- else if (rdtgrp->type == RDTMON_GROUP)
- closid = rdtgrp->mon.parent->closid;
- else
- return false;
+ closid = rdtgrp->closid;
+ rmid = rdtgrp->mon.rmid;
return resctrl_arch_match_closid(tsk, closid) &&
resctrl_arch_match_rmid(tsk, closid, rmid);
}
+/**
+ * task_belongs_to_ctrl_group - the helper to check if the task exists in
+ * the target control group.
+ * @tsk: task to be checked
+ * @rdtgrp: target control group
+ */
+static bool task_belongs_to_ctrl_group(struct task_struct *tsk, struct rdtgroup *rdtgrp)
+{
+ struct rdtgroup *crdtgrp;
+
+ /* Check whether exists in contrl group self */
+ if (resctrl_arch_match_closid(tsk, rdtgrp->closid))
+ return true;
+
+ /* Check if exists in one of children monitor groups */
+ list_for_each_entry(crdtgrp, &rdtgrp->mon.crdtgrp_list, mon.crdtgrp_list) {
+ if (resctrl_arch_match_closid(tsk, crdtgrp->closid))
+ return true;
+ }
+
+ return false;
+}
+
static int __rdtgroup_move_task(struct task_struct *tsk,
struct rdtgroup *rdtgrp)
{
@@ -618,17 +637,13 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
* their parent CTRL group.
*/
if (rdtgrp->type == RDTMON_GROUP &&
- !resctrl_arch_match_closid(tsk, rdtgrp->mon.parent->closid)) {
+ !task_belongs_to_ctrl_group(tsk, rdtgrp->mon.parent)) {
rdt_last_cmd_puts("Can't move task to different control group\n");
return -EINVAL;
}
- if (rdtgrp->type == RDTMON_GROUP)
- resctrl_arch_set_closid_rmid(tsk, rdtgrp->mon.parent->closid,
- rdtgrp->mon.rmid);
- else
- resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid,
- rdtgrp->mon.rmid);
+ resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid,
+ rdtgrp->mon.rmid);
/*
* Ensure the task's closid and rmid are written before determining if
@@ -652,14 +667,13 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
{
return (resctrl_arch_alloc_capable() && (r->type == RDTCTRL_GROUP) &&
- resctrl_arch_match_closid(t, r->closid));
+ task_belongs_to_ctrl_group(t, r));
}
static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
{
return (resctrl_arch_mon_capable() && (r->type == RDTMON_GROUP) &&
- resctrl_arch_match_rmid(t, r->mon.parent->closid,
- r->mon.rmid));
+ resctrl_arch_match_rmid(t, r->closid, r->mon.rmid));
}
/**
@@ -909,7 +923,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
rdtg->mode != RDT_MODE_EXCLUSIVE)
continue;
- if (!resctrl_arch_match_closid(tsk, rdtg->closid))
+ if (!task_belongs_to_ctrl_group(tsk, rdtg))
continue;
seq_printf(s, "res:%s%s\n", (rdtg == &rdtgroup_default) ? "/" : "",
@@ -917,7 +931,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
seq_puts(s, "mon:");
list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
mon.crdtgrp_list) {
- if (!resctrl_arch_match_rmid(tsk, crg->mon.parent->closid,
+ if (!resctrl_arch_match_rmid(tsk, crg->closid,
crg->mon.rmid))
continue;
seq_printf(s, "%s", crg->kn->name);
--
2.25.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
` (5 preceding siblings ...)
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 6/6] fs/resctrl: Add the helper to check if the task exists in the target group Zeng Heng
@ 2024-11-19 15:31 ` Dave Martin
2024-11-23 9:34 ` Zeng Heng
6 siblings, 1 reply; 11+ messages in thread
From: Dave Martin @ 2024-11-19 15:31 UTC (permalink / raw)
To: Zeng Heng
Cc: james.morse, linux-kernel, jonathan.cameron, xiexiuqi,
linux-arm-kernel
Hi,
On Tue, Nov 19, 2024 at 09:50:58PM +0800, Zeng Heng wrote:
> The patch set is applied for mpam/snapshot/v6.12-rc1 branch of
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
> repository.
>
> This patch set is fully compatible with x86 RDT functionality.
>
> The narrow-partid feature in MPAM allows for a more efficient use of
> PARTIDs by enabling a many-to-one mapping of reqpartids (requested PARTIDs)
> to intpartids (internal PARTIDs). This mapping reduces the number of unique
> PARTIDs needed, thus allowing more tasks or processes to be monitored and
> managed with the available resources.
>
> Intpartid(Internal PARTID) is an internal identifier used by the hardware
> to represent a specific resource partition. It is a low-level identifier
> that the hardware uses to track and manage resource allocation and
> monitoring.
>
> Reqpartid(Request PARTID) is an identifier provided by the software when
> requesting resources from the memory system. It indicates the desired
> partition for resource monitoring. By using reqpartids, software can
> monitor specific resources or allow the system to subdivide smaller
> granularity partitions within existing partitions to serve as monitoring
> partitions.
>
> For the new rmid allocation strategy, it will check whether there is an
> available rmid of any reqPARTID which belongs to the input intPARTID.
>
> The MPAM driver statically assigns all reqPARTIDs to respective intPARTIDs,
> with a specific illustration as follows:
>
> m - Indicates the number of reqPARTIDs per intPARTID
> n - Indicates the total number of intPARTIDs
> (m * n) - Represents the total number of reqPARTIDs
>
> intPARTID_1 = 0
> ├── reqPARTID_1_1 = 0
> ├── reqPARTID_1_2 = 0 + n
> ├── ...
> └── reqPARTID_1_m = 0 + n * (m - 1)
>
> intPARTID_2 = 1
> ├── reqPARTID_2_1 = 1
> ├── reqPARTID_2_2 = 1 + n
> ├── ...
> └── reqPARTID_2_m = 1 + n * (m - 1)
>
> ...
>
> intPARTID_n = (n - 1)
>
> Each intPARTID has m reqPARTIDs, which are used to expand the number of
> monitoring groups under the control group. Therefore, the number of
> monitoring groups is no longer limited by the range of MPAM PMG, which
> enhances the extensibility of the system's monitoring capabilities.
The idea of mapping multiple reqPARTIDs to each resctrl control group
looks like it can work, but I think that there are some issues that
need to be considered:
1) There may be a mixture of MSCs in the system, some of which support
PARTID Narrowing and some of which do not. Affected MSCs will not be
able to regulate resource consumption for a single resctrl control
group as a single unit, if multiple reqPARTIDs are used.
This matters when an MSC that does not support PARTID Narrowing also
has resource controls that are not of the "partition bitmap" type.
(Consider a resctrl control partition that throttles the partition to
30% of memory bandwidth. How can the same behaviour be achieved if the
partition is split arbitrarily across multiple reqPARTIDs?)
Because the MPAM driver needs to be as general as possible, it may be
hard to make the "right" decision about whether to group reqPARTIDs to
provide more monitoring groups. because different use cases may have
different requirments (e.g., number of control groups versus number of
monitoring groups, and which types of control are useful).
2) The resctrl core code uses CLOSIDs and RMIDs to identify control
groups and monitoring groups. If a particular driver wants to
translate these into other values (reqPARTID, intPARTID, PMG) then it
can do so, but this mapping logic should be encapsulated in the driver.
This should be better for maintainability, since the details of the
remapping will be arch-specific -- and in general not all arches are
going to require it. With this in mind, I think that changes in the
resctrl core code would be minimal (perhaps no changes at all).
3) How should the amount of reqPARTID grouping (your "n" parameter) be
determined, in general?
As with (1), the right answer may depend on the use case as well as on
the hardware.
From my investigations into this, I feel that some configuration
parameters will probably be needed, at least at boot time.
4) If the mapping between reqPARTIDs and (CLOSID,RMID) pairs is static,
is it necessary to track which reqPARTIDs are in use? Would it be
simpler to treat all n reqPARTIDs as permanently assigned to the
corresponding CLOSID?
If reqPARTID usage is not tracked, then every control change on MSCs
that do not support PARTID Narrowing would need to be replicated across
all reqPARTIDs corresponding to the affected resctrl control partition.
But control changes are a relatively rare event, so this approach feels
acceptable as a way of keeping the driver complexity down. It partly
depends on how large the "n" parameter can become.
(Since PARTID Narrowing allows any arbitrary set of reqPARTIDs to be
mapped to a given intPARTID, it might be possible to allocate
reqPARTIDs completely dynamically. But this probably would require a
change to the resctrl core interface. It is also not clear to me
whether the "num_closids" and "num_rmids" values advertised to
userspace can be satisfied. For now, static allocation seems the most
straightforward way to to get better monitoring, but perhaps it could
be enhanced later on.)
[...]
Cheers
---Dave
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver
2024-11-19 15:31 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Dave Martin
@ 2024-11-23 9:34 ` Zeng Heng
2024-11-25 15:39 ` Dave Martin
0 siblings, 1 reply; 11+ messages in thread
From: Zeng Heng @ 2024-11-23 9:34 UTC (permalink / raw)
To: Dave Martin
Cc: james.morse, linux-kernel, jonathan.cameron, xiexiuqi,
linux-arm-kernel, bobo.shaobowang
Hi,
Thanks for comments!
On 2024/11/19 23:31, Dave Martin wrote:
>
> 1) There may be a mixture of MSCs in the system, some of which support
> PARTID Narrowing and some of which do not. Affected MSCs will not be
> able to regulate resource consumption for a single resctrl control
> group as a single unit, if multiple reqPARTIDs are used.
>
> This matters when an MSC that does not support PARTID Narrowing also
> has resource controls that are not of the "partition bitmap" type.
>
> (Consider a resctrl control partition that throttles the partition to
> 30% of memory bandwidth. How can the same behaviour be achieved if the
> partition is split arbitrarily across multiple reqPARTIDs?)
>
> Because the MPAM driver needs to be as general as possible, it may be
> hard to make the "right" decision about whether to group reqPARTIDs to
> provide more monitoring groups. because different use cases may have
> different requirments (e.g., number of control groups versus number of
> monitoring groups, and which types of control are useful).
1. The patch set solution is designed considering mixed MSC scenarios.
Regarding the definition of the quantity 'n', here is a detailed
explanation:
n - Indicates the total number of intPARTIDs
l - Represents the total number of reqPARTIDs
m - Indicates the number of reqPARTIDs per intPARTID
The values of n/l/m are derived from the following formula:
n = min(intPARTID-np, PARTID-nnp)
l = min(reqPARTID-np, PARTID-nnp)
m = l // n
reqPARTID-np -- The number of reqPARTIDs supported by MSCs that support
narrow-partid.
intPARTID-np -- The number of intPARTIDs supported by MSCs that support
narrow partid.
PARTID-nnp -- The number of PARTIDs supported by MSCs that do not support
narrow partid.
The software needs to ensure that 'm' is an integer, meaning the number of
supported reqPARTIDs is an integer multiple of 'n'.
To illustrate how to determine n, l, and m through examples, we can assume
a specific platform:
L3 - Supports the narrow PARTID feature, supports 32 intPARTIDs, and
supports 256 reqPARTIDs.
mata - Does not support the narrow PARTID feature, supports a range of
256 PARTIDs.
Then,
n = min(intPARTID-l3, PARTID-mata) = min(32, 256) = 32
l = min(reqPARTID-l3, PARTID-mata) = min(256,256) = 256
m = 256 / 32 = 8
The mapping relationships between each group's closid and the respective
MSCs' intPARTID/reqPartid/PARTID are illustrated:
P - partition group
M - monitoring group
Group: Closid MSCs with narrow-partid MSCs without narrow-partid
P1 : 0 intPARTID_1 PARTID_1
M1_1 : 0 ├── reqPARTID_1_1 ├── PARTID_1_1
M1_2 : 0+n ├── reqPARTID_1_2 ├── PARTID_1_2
... ├── ... ├── ...
M1_m : 0+n*(m-1) └── reqPARTID_1_m └── PARTID_1_m
P2 : 1 intPARTID_2 PARTID_2
M2_1 : 1 ├── reqPARTID_2_1 ├── PARTID_2_1
M2_2 : 1+n ├── reqPARTID_2_2 ├── PARTID_2_2
... ├── ... ├── ...
M2_m : 1+n*(m-1) └── reqPARTID_2_m └── PARTID_2_m
Pn : (n-1) intPARTID_n PARTID_n
Mn_1 : (n-1) ├── reqPARTID_n_1 ├── PARTID_n_1
Mn_2 : (n-1)+n ├── reqPARTID_n_2 ├── PARTID_n_2
... ├── ... ├── ...
Mn_m : (n-1)+n*(m-1) = n*m-1 └── reqPARTID_n_m └── PARTID_n_m
The advantages of doing this are:
1. There is no need to modify or disrupt the existing resctrl layer
interface, ensuring that each control group has same resource
control functionality;
2. MSCs that support narrow-partid (including intPARTID and reqPARTID)
and MSCs that do not support (only including PARTID) can share the
same PARTID space;
3. On the premise of ensuring the (1) point, the number of control
groups can be maximized, because users can always choose to make a
control group act as a sub-monitoring group under another control
group;
> 2) The resctrl core code uses CLOSIDs and RMIDs to identify control
> groups and monitoring groups. If a particular driver wants to
> translate these into other values (reqPARTID, intPARTID, PMG) then it
> can do so, but this mapping logic should be encapsulated in the driver.
> This should be better for maintainability, since the details of the
> remapping will be arch-specific -- and in general not all arches are
> going to require it. With this in mind, I think that changes in the
> resctrl core code would be minimal (perhaps no changes at all).
Yes, maintaining the interface of the resctrl core code unchanged is,
in essence, the (first) important constraint of the current MPAM code.
We try the best to keep all resctrl interfaces and ensure the existing
functionality of x86 RDT.
The only thing that falls short of being ideal (forgive me), is that
it introduces the sole new function resctrl_arch_alloc_rmid() into the
resctrl code (resctrl_arch_free_rmid() will be optimized away in the next
version, and there are no other new functions any more).
The resctrl_arch_alloc_rmid() is the result of several restructuring
iterations and it is one of the most critical points in the patch series.
> 3) How should the amount of reqPARTID grouping (your "n" parameter) be
> determined, in general?
>
> As with (1), the right answer may depend on the use case as well as on
> the hardware.
>
> >From my investigations into this, I feel that some configuration
> parameters will probably be needed, at least at boot time.
As mentioned earlier,
Total number of intPARTIDs: n = min(intPARTID-np, PARTID-nnp)
Total number of reqPARTIDs: l = min(reqPARTID-np, PARTID-nnp)
We maximize the number of control groups because users can always
choose to make a control group act as a sub-monitoring group any time.
> 4) If the mapping between reqPARTIDs and (CLOSID,RMID) pairs is static,
> is it necessary to track which reqPARTIDs are in use? Would it be
> simpler to treat all n reqPARTIDs as permanently assigned to the
> corresponding CLOSID?
>
> If reqPARTID usage is not tracked, then every control change on MSCs
> that do not support PARTID Narrowing would need to be replicated across
> all reqPARTIDs corresponding to the affected resctrl control partition.
> But control changes are a relatively rare event, so this approach feels
> acceptable as a way of keeping the driver complexity down. It partly
> depends on how large the "n" parameter can become.
Yes, totally agree. I will try to remove the reqPARTID bitmap and
the resctrl_arch_free_rmid(). As mentioned, this will simplify the code
logic and reduce changes to the resctrl layer code.
Initially, to reduce the number of IPI interrupt, keep this resource
tracking until now, and I will prioritize optimization for the next
version.
(In fact, the initial version of the patch set was dynamically allocated,
and during the code restructuring process, it was inevitable to retain
some of the original ideas.)
Best regards,
Zeng Heng
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver
2024-11-23 9:34 ` Zeng Heng
@ 2024-11-25 15:39 ` Dave Martin
2024-12-02 2:12 ` Zeng Heng
0 siblings, 1 reply; 11+ messages in thread
From: Dave Martin @ 2024-11-25 15:39 UTC (permalink / raw)
To: Zeng Heng
Cc: james.morse, linux-kernel, jonathan.cameron, xiexiuqi,
linux-arm-kernel, bobo.shaobowang
Hi,
On Sat, Nov 23, 2024 at 05:34:08PM +0800, Zeng Heng wrote:
> Hi,
>
> Thanks for comments!
>
> On 2024/11/19 23:31, Dave Martin wrote:
> >
> > 1) There may be a mixture of MSCs in the system, some of which support
> > PARTID Narrowing and some of which do not. Affected MSCs will not be
> > able to regulate resource consumption for a single resctrl control
> > group as a single unit, if multiple reqPARTIDs are used.
> >
> > This matters when an MSC that does not support PARTID Narrowing also
> > has resource controls that are not of the "partition bitmap" type.
> >
> > (Consider a resctrl control partition that throttles the partition to
> > 30% of memory bandwidth. How can the same behaviour be achieved if the
> > partition is split arbitrarily across multiple reqPARTIDs?)
> >
> > Because the MPAM driver needs to be as general as possible, it may be
> > hard to make the "right" decision about whether to group reqPARTIDs to
> > provide more monitoring groups. because different use cases may have
> > different requirments (e.g., number of control groups versus number of
> > monitoring groups, and which types of control are useful).
>
> 1. The patch set solution is designed considering mixed MSC scenarios.
>
> Regarding the definition of the quantity 'n', here is a detailed
>
> explanation:
>
> n - Indicates the total number of intPARTIDs
>
> l - Represents the total number of reqPARTIDs
>
> m - Indicates the number of reqPARTIDs per intPARTID
>
> The values of n/l/m are derived from the following formula:
>
> n = min(intPARTID-np, PARTID-nnp)
>
> l = min(reqPARTID-np, PARTID-nnp)
>
> m = l // n
>
> reqPARTID-np -- The number of reqPARTIDs supported by MSCs that support
> narrow-partid.
>
> intPARTID-np -- The number of intPARTIDs supported by MSCs that support
> narrow partid.
> PARTID-nnp -- The number of PARTIDs supported by MSCs that do not support
> narrow partid.
>
> The software needs to ensure that 'm' is an integer, meaning the number of
>
> supported reqPARTIDs is an integer multiple of 'n'.
>
> To illustrate how to determine n, l, and m through examples, we can assume
>
> a specific platform:
>
> L3 - Supports the narrow PARTID feature, supports 32 intPARTIDs, and
>
> supports 256 reqPARTIDs.
>
> mata - Does not support the narrow PARTID feature, supports a range of
>
> 256 PARTIDs.
>
> Then,
>
> n = min(intPARTID-l3, PARTID-mata) = min(32, 256) = 32
>
> l = min(reqPARTID-l3, PARTID-mata) = min(256,256) = 256
>
> m = 256 / 32 = 8
>
> The mapping relationships between each group's closid and the respective
>
> MSCs' intPARTID/reqPartid/PARTID are illustrated:
>
> P - partition group
>
> M - monitoring group
>
> Group: Closid MSCs with narrow-partid MSCs without narrow-partid
> P1 : 0 intPARTID_1 PARTID_1
> M1_1 : 0 ├── reqPARTID_1_1 ├── PARTID_1_1
> M1_2 : 0+n ├── reqPARTID_1_2 ├── PARTID_1_2
> ... ├── ... ├── ...
> M1_m : 0+n*(m-1) └── reqPARTID_1_m └── PARTID_1_m
>
> P2 : 1 intPARTID_2 PARTID_2
> M2_1 : 1 ├── reqPARTID_2_1 ├── PARTID_2_1
> M2_2 : 1+n ├── reqPARTID_2_2 ├── PARTID_2_2
> ... ├── ... ├── ...
> M2_m : 1+n*(m-1) └── reqPARTID_2_m └── PARTID_2_m
>
> Pn : (n-1) intPARTID_n PARTID_n
> Mn_1 : (n-1) ├── reqPARTID_n_1 ├── PARTID_n_1
> Mn_2 : (n-1)+n ├── reqPARTID_n_2 ├── PARTID_n_2
> ... ├── ... ├── ...
> Mn_m : (n-1)+n*(m-1) = n*m-1 └── reqPARTID_n_m └── PARTID_n_m
Thanks for this illustration.
> The advantages of doing this are:
>
> 1. There is no need to modify or disrupt the existing resctrl layer
>
> interface, ensuring that each control group has same resource
>
> control functionality;
I don't think this is guaranteed.
If there is some MSC that does not have PARTID Narrowing support, and
this MSC has a memory bandwidth control that the MPAM driver exposes
through resctrl, then there is no way to configure that MSC that
exhibits the behaviour that the resctrl user expects.
For a concrete example:
Suppose that n=8, and the user asks for P1 to be given 30% of system
memory bandwidth.
On the affected MSC, P1 maps to eight PARTIDs, each with its own memory
bandwidth regulation.
If the work that happens to be in M1_1 dominates P1's bandwith
requirment, then PARTID_1_1 needs to be given 30% of total memory bandwidth.
If the work in P1 is evenly spread across M1_1, M1_2 ... M1_m, then
they would each need to be given 30% / 8 = 3.75% of total memory
bandwidth so that the total allocated bandwidth is 30%.
But we don't know how memory bandwidth consumption is distributed among
M1_1, M2_2 etc., so there is no way to program the memory bandiwdth
regulation on that MSC that guarantees the expected result of P1
receiving 30% of the total available bandwidth.
This means that on some hardware, a choice needs to be made: should the
MPAM driver hide from resctrl any controls that have this problem, or
should it disable the use of PARTID Narrowing for providing additional
monitoring groups.
My concern is that the correct choice is likely to be use-case-
dependent.
Do you have a view on this?
> 2. MSCs that support narrow-partid (including intPARTID and reqPARTID)
>
> and MSCs that do not support (only including PARTID) can share the
>
> same PARTID space;
This seems like it may be problematic on some hardware, as I tried to
explain above for point 1.
Note though, if the non-Narrowing MSCs only have bitmap-type controls,
then sharing the PARTID space is harmless. This comes about because
because these controls explicitly allow contention: cache way 0 for
example is contended between all the work that is allowed by MPAM to
use this cache way. Breaking up the work arbitrarily under different
PARTIDs makes no difference in this case: the amount of work allocated
to that cache way, and the amount of contention is still the same.
>
> 3. On the premise of ensuring the (1) point, the number of control
>
> groups can be maximized, because users can always choose to make a
>
> control group act as a sub-monitoring group under another control
>
> group;
What do you mean by "control group" here?
resctrl's group hierarchy is strict: work is distributed across one or
more control groups at the top level, and the work in each control
group is further distributed across one or more monitoring groups
within that control group.
There is no way to repurpose a resctrl control group is a monitoring
group under some other control group.
Or were you referring to something else here?
> > 2) The resctrl core code uses CLOSIDs and RMIDs to identify control
> > groups and monitoring groups. If a particular driver wants to
> > translate these into other values (reqPARTID, intPARTID, PMG) then it
> > can do so, but this mapping logic should be encapsulated in the driver.
> > This should be better for maintainability, since the details of the
> > remapping will be arch-specific -- and in general not all arches are
> > going to require it. With this in mind, I think that changes in the
> > resctrl core code would be minimal (perhaps no changes at all).
> Yes, maintaining the interface of the resctrl core code unchanged is,
> in essence, the (first) important constraint of the current MPAM code.
> We try the best to keep all resctrl interfaces and ensure the existing
> functionality of x86 RDT.
>
> The only thing that falls short of being ideal (forgive me), is that
> it introduces the sole new function resctrl_arch_alloc_rmid() into the
> resctrl code (resctrl_arch_free_rmid() will be optimized away in the next
> version, and there are no other new functions any more).
>
> The resctrl_arch_alloc_rmid() is the result of several restructuring
> iterations and it is one of the most critical points in the patch series.
I was concerned about the changes in patch 6 for example, where the new
function task_belongs_to_ctrl_group() now has to look at more
information that just rdtgroup->closid, in order to determine which
control group a task belongs to. This is precisely what
resctrl_arch_match_closid() is supposed to do, using just the closid.
This suggests that the meaning of "closid" in the core code has been
changed: if closid is the control group identifier, then each control
group should have exactly one closid value.
For comparison, you may want to take a look at the top 3 commits of
this experimental branch:
https://git.gitlab.arm.com/linux-arm/linux-dm/-/commits/mpam/partid-pmg-remap/v0.2/head/?ref_type=heads
which attempts to do all the mapping within the MPAM driver instead.
Note, the approach is a bit over-complicated and I decided that a
simpler approach is needed. But it may help to illustrate what I mean
about keeping all the remapping out of the resctrl core code.
>
> > 3) How should the amount of reqPARTID grouping (your "n" parameter) be
> > determined, in general?
> >
> > As with (1), the right answer may depend on the use case as well as on
> > the hardware.
> >
> > >From my investigations into this, I feel that some configuration
> > parameters will probably be needed, at least at boot time.
> As mentioned earlier,
> Total number of intPARTIDs: n = min(intPARTID-np, PARTID-nnp)
> Total number of reqPARTIDs: l = min(reqPARTID-np, PARTID-nnp)
>
> We maximize the number of control groups because users can always
> choose to make a control group act as a sub-monitoring group any time.
I'm still not sure what you mean here; see my response on point 3.
(I might be misunderstanding something here, but if you can give an
illustration then that may help.)
>
> > 4) If the mapping between reqPARTIDs and (CLOSID,RMID) pairs is static,
> > is it necessary to track which reqPARTIDs are in use? Would it be
> > simpler to treat all n reqPARTIDs as permanently assigned to the
> > corresponding CLOSID?
> >
> > If reqPARTID usage is not tracked, then every control change on MSCs
> > that do not support PARTID Narrowing would need to be replicated across
> > all reqPARTIDs corresponding to the affected resctrl control partition.
> > But control changes are a relatively rare event, so this approach feels
> > acceptable as a way of keeping the driver complexity down. It partly
> > depends on how large the "n" parameter can become.
> Yes, totally agree. I will try to remove the reqPARTID bitmap and
> the resctrl_arch_free_rmid(). As mentioned, this will simplify the code
> logic and reduce changes to the resctrl layer code.
>
> Initially, to reduce the number of IPI interrupt, keep this resource
> tracking until now, and I will prioritize optimization for the next
> version.
> (In fact, the initial version of the patch set was dynamically allocated,
> and during the code restructuring process, it was inevitable to retain
> some of the original ideas.)
>
> Best regards,
> Zeng Heng
>
OK; fair enough.
This kind of feature could always be re-added later on if it proves to
be important for performance in real use-cases, but it is probably best
to keep things as simple as possible initially.
Cheers
---Dave
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver
2024-11-25 15:39 ` Dave Martin
@ 2024-12-02 2:12 ` Zeng Heng
0 siblings, 0 replies; 11+ messages in thread
From: Zeng Heng @ 2024-12-02 2:12 UTC (permalink / raw)
To: Dave Martin
Cc: james.morse, linux-kernel, jonathan.cameron, xiexiuqi,
linux-arm-kernel, bobo.shaobowang
On 2024/11/25 23:39, Dave Martin wrote:
>> The advantages of doing this are:
>>
>> 1. There is no need to modify or disrupt the existing resctrl layer
>>
>> interface, ensuring that each control group has same resource
>>
>> control functionality;
>
> I don't think this is guaranteed.
>
> If there is some MSC that does not have PARTID Narrowing support, and
> this MSC has a memory bandwidth control that the MPAM driver exposes
> through resctrl, then there is no way to configure that MSC that
> exhibits the behaviour that the resctrl user expects.
>
> For a concrete example:
>
> Suppose that n=8, and the user asks for P1 to be given 30% of system
> memory bandwidth.
>
> On the affected MSC, P1 maps to eight PARTIDs, each with its own memory
> bandwidth regulation.
>
> If the work that happens to be in M1_1 dominates P1's bandwith
> requirment, then PARTID_1_1 needs to be given 30% of total memory bandwidth.
>
> If the work in P1 is evenly spread across M1_1, M1_2 ... M1_m, then
> they would each need to be given 30% / 8 = 3.75% of total memory
> bandwidth so that the total allocated bandwidth is 30%.
>
> But we don't know how memory bandwidth consumption is distributed among
> M1_1, M2_2 etc., so there is no way to program the memory bandiwdth
> regulation on that MSC that guarantees the expected result of P1
> receiving 30% of the total available bandwidth.
>
>
> This means that on some hardware, a choice needs to be made: should the
> MPAM driver hide from resctrl any controls that have this problem, or
> should it disable the use of PARTID Narrowing for providing additional
> monitoring groups.
>
> My concern is that the correct choice is likely to be use-case-
> dependent.
>
> Do you have a view on this?
I understand your meaning and concerns, and this is indeed a problem.
From a software perspective, I think the use cases should be limited.
For scenarios where mata does not support narrow-partid, I tend to favor
disabling the narrow-partid feature in the driver for such scenarios.
From a hardware perspective, MSCs, such as L2/L3, are designed with area
considerations in mind and choose to implement the narrow-partid feature.
MATA, on the other hand, is located on a different die and does not have
similar concerns, often not considering the implementation of the
narrow-partid feature, which makes this a rather thorny issue.
>> 2. MSCs that support narrow-partid (including intPARTID and reqPARTID)
>>
>> and MSCs that do not support (only including PARTID) can share the
>>
>> same PARTID space;
>
> This seems like it may be problematic on some hardware, as I tried to
> explain above for point 1.
>
> Note though, if the non-Narrowing MSCs only have bitmap-type controls,
> then sharing the PARTID space is harmless. This comes about because
> because these controls explicitly allow contention: cache way 0 for
> example is contended between all the work that is allowed by MPAM to
> use this cache way. Breaking up the work arbitrarily under different
> PARTIDs makes no difference in this case: the amount of work allocated
> to that cache way, and the amount of contention is still the same.
>
Completely agree. MSCs without the narrow-partid feature, if they only
have bitmap-type controls, can be compatible with the shared PARTID
space scheme.
>>
>> 3. On the premise of ensuring the (1) point, the number of control
>>
>> groups can be maximized, because users can always choose to make a
>>
>> control group act as a sub-monitoring group under another control
>>
>> group;
>
> What do you mean by "control group" here?
>
> resctrl's group hierarchy is strict: work is distributed across one or
> more control groups at the top level, and the work in each control
> group is further distributed across one or more monitoring groups
> within that control group.
>
> There is no way to repurpose a resctrl control group is a monitoring
> group under some other control group.
>
> Or were you referring to something else here?
>
Apologies for my miscommunication.
What I meant to say is to use the extra PARTIDs of MSC (which do not support
the narrow-partid feature) as an expansion for number of sub-monitoring
groups.
>>> 2) The resctrl core code uses CLOSIDs and RMIDs to identify control
>>> groups and monitoring groups. If a particular driver wants to
>>> translate these into other values (reqPARTID, intPARTID, PMG) then it
>>> can do so, but this mapping logic should be encapsulated in the driver.
>>> This should be better for maintainability, since the details of the
>>> remapping will be arch-specific -- and in general not all arches are
>>> going to require it. With this in mind, I think that changes in the
>>> resctrl core code would be minimal (perhaps no changes at all).
>
>> Yes, maintaining the interface of the resctrl core code unchanged is,
>> in essence, the (first) important constraint of the current MPAM code.
>> We try the best to keep all resctrl interfaces and ensure the existing
>> functionality of x86 RDT.
>>
>> The only thing that falls short of being ideal (forgive me), is that
>> it introduces the sole new function resctrl_arch_alloc_rmid() into the
>> resctrl code (resctrl_arch_free_rmid() will be optimized away in the next
>> version, and there are no other new functions any more).
>>
>> The resctrl_arch_alloc_rmid() is the result of several restructuring
>> iterations and it is one of the most critical points in the patch series.
>
> I was concerned about the changes in patch 6 for example, where the new
> function task_belongs_to_ctrl_group() now has to look at more
> information that just rdtgroup->closid, in order to determine which
> control group a task belongs to. This is precisely what
> resctrl_arch_match_closid() is supposed to do, using just the closid.
>
> This suggests that the meaning of "closid" in the core code has been
> changed: if closid is the control group identifier, then each control
> group should have exactly one closid value.
>
>
> For comparison, you may want to take a look at the top 3 commits of
> this experimental branch:
>
> https://git.gitlab.arm.com/linux-arm/linux-dm/-/commits/mpam/partid-pmg-remap/v0.2/head/?ref_type=heads
>
> which attempts to do all the mapping within the MPAM driver instead.
> Note, the approach is a bit over-complicated and I decided that a
> simpler approach is needed. But it may help to illustrate what I mean
> about keeping all the remapping out of the resctrl core code.
>
>
I understand your suggestion. I will consider refactoring the mapping
relationships between closid/rmid and partid/reqpartid/intpartid/pmg.
In fact, I prepared a simplified version of v2 as v3. But in light of
your suggestions, I decide to reconstruct the solution. At present, I'm
not sure if I can completely isolated the mapping within the MPAM driver
layer only. If my reconstructed version goes smoothly, I will reply ASAP.
>>> 4) If the mapping between reqPARTIDs and (CLOSID,RMID) pairs is static,
>>> is it necessary to track which reqPARTIDs are in use? Would it be
>>> simpler to treat all n reqPARTIDs as permanently assigned to the
>>> corresponding CLOSID?
>>>
>>> If reqPARTID usage is not tracked, then every control change on MSCs
>>> that do not support PARTID Narrowing would need to be replicated across
>>> all reqPARTIDs corresponding to the affected resctrl control partition.
>>> But control changes are a relatively rare event, so this approach feels
>>> acceptable as a way of keeping the driver complexity down. It partly
>>> depends on how large the "n" parameter can become.
>> Yes, totally agree. I will try to remove the reqPARTID bitmap and
>> the resctrl_arch_free_rmid(). As mentioned, this will simplify the code
>> logic and reduce changes to the resctrl layer code.
>>
>> Initially, to reduce the number of IPI interrupt, keep this resource
>> tracking until now, and I will prioritize optimization for the next
>> version.
>> (In fact, the initial version of the patch set was dynamically allocated,
>> and during the code restructuring process, it was inevitable to retain
>> some of the original ideas.)
>>
>> Best regards,
>> Zeng Heng
>>
>
> OK; fair enough.
>
> This kind of feature could always be re-added later on if it proves to
> be important for performance in real use-cases, but it is probably best
> to keep things as simple as possible initially.
>
Many thanks as always for your prompt reply and insightful suggestions.
Best Regards,
Zeng Heng
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-12-02 2:12 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-19 13:50 [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Zeng Heng
2024-11-19 13:50 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 1/6] arm_mpam: Introduce the definitions of intPARTID and reqPARTID Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 2/6] arm_mpam: Create reqPARTIDs resource bitmap Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 3/6] arm_mpam: Enhance the rmid allocation strategy Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 4/6] arm_mpam: Set INTERNAL as needed when setting MSC controls Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 5/6] arm_mpam: Call resctrl_sync_config() when allocate new reqPARTID Zeng Heng
2024-11-19 13:51 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 6/6] fs/resctrl: Add the helper to check if the task exists in the target group Zeng Heng
2024-11-19 15:31 ` [RFC PATCH mpam mpam/snapshot/v6.12-rc1 v2 0/6] arm_mpam: Introduce the Narrow-PARTID feature for MPAM driver Dave Martin
2024-11-23 9:34 ` Zeng Heng
2024-11-25 15:39 ` Dave Martin
2024-12-02 2:12 ` Zeng Heng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox