* [PATCH v3 0/3] x86,fs/resctrl,arm_mpam: Factor MBA parse-time conversion to be per-arch
From: Ben Horgan @ 2026-05-15 14:06 UTC
To: ben.horgan
Cc: james.morse, reinette.chatre, fenghuay, linux-kernel,
linux-arm-kernel, tglx, mingo, bp, dave.hansen, hpa, corbet, x86,
linux-doc, dave.martin
This is a new version of Dave Martin's patch [1] to delegate rounding of
bandwidth control user values to the arch code. As there is now more than one
architecture using resctrl, I split the original patch into two, a core resctrl
patch and an x86 patch, and added an MPAM patch. Please let me know whether the
patch breakdown and ordering are sensible, and whether the pattern should be
followed for any future similar changes.
This does have a user-visible effect on the MB schema when using MPAM hardware
with 'bandwidth_gran' greater than 1. I'm not sure whether MPAM hardware with
such coarse controls exists in the wild, but it is spec compliant and I've
tested it on a model.
[1] https://lore.kernel.org/lkml/20251031154225.14799-1-Dave.Martin@arm.com/
Ben Horgan (2):
x86/resctrl: Add resctrl_arch_preconvert_bw()
arm_mpam: resctrl: Add pass-through resctrl_arch_preconvert_bw()
Dave Martin (1):
fs/resctrl: Factor MBA parse-time conversion to be per-arch
Documentation/filesystems/resctrl.rst | 17 +++++++++--------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 6 ++++++
drivers/resctrl/mpam_resctrl.c | 5 +++++
fs/resctrl/ctrlmondata.c | 6 +++---
include/linux/resctrl.h | 19 +++++++++++++++++++
5 files changed, 42 insertions(+), 11 deletions(-)
--
2.43.0
* [PATCH v3 1/3] x86/resctrl: Add resctrl_arch_preconvert_bw()
From: Ben Horgan @ 2026-05-15 14:06 UTC
To: ben.horgan
Cc: james.morse, reinette.chatre, fenghuay, linux-kernel,
linux-arm-kernel, tglx, mingo, bp, dave.hansen, hpa, corbet, x86,
linux-doc, dave.martin
On MPAM systems, the rounding behaviour of the MBA control would be improved
if the rounding in the fs/resctrl code were removed, but this is not the
case for x86. A new arch hook is required so that the arch code can specify
any rounding or conversion of the bandwidth value provided by the user.
Introduce resctrl_arch_preconvert_bw(), and add its x86 implementation.
This is currently unused in resctrl but when plumbed in it will replace the
call to roundup() in bw_validate().
Signed-off-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since Dave's v2:
Split from larger patch and add commit message
Update kernel-doc (Reinette)
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 6 ++++++
include/linux/resctrl.h | 19 +++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index b20e705606b8..19ae596f6b30 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -16,9 +16,15 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/cpu.h>
+#include <linux/math.h>
#include "internal.h"
+u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r)
+{
+ return roundup(val, (unsigned long)r->membw.bw_gran);
+}
+
int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type t, u32 cfg_val)
{
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 006e57fd7ca5..33a6742da4f9 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -500,6 +500,25 @@ bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
*/
int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
+/**
+ * resctrl_arch_preconvert_bw() - Prepare bandwidth control value for arch use.
+ * @val: Bandwidth control value written to the schemata file by userspace.
+ * @r: Resource whose schema was written.
+ *
+ * Convert the user provided bandwidth control value to an appropriate form for
+ * consumption by the hardware driver for resource @r. Converted value is stored
+ * in rdt_ctrl_domain::staged_config[] for later consumption by
+ * resctrl_arch_update_domains(). Not called when the MBA software controller
+ * is enabled.
+ *
+ * Architectures for which this pre-conversion hook is not useful should supply
+ * an implementation of this function that just returns val unmodified.
+ *
+ * Return:
+ * The converted value.
+ */
+u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r);
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.43.0
* [PATCH v3 2/3] arm_mpam: resctrl: Add pass-through resctrl_arch_preconvert_bw()
From: Ben Horgan @ 2026-05-15 14:06 UTC
To: ben.horgan
Cc: james.morse, reinette.chatre, fenghuay, linux-kernel,
linux-arm-kernel, tglx, mingo, bp, dave.hansen, hpa, corbet, x86,
linux-doc, dave.martin
resctrl rounds up the percentage value of the MBA based on the bw_gran. As
MPAM uses a binary fixed point fraction format for MBA rather than a
decimal percentage, this introduces rounding errors.
Without this additional rounding, if the user reads the value in an MB
schema and then writes it back to the schema, the value in hardware won't
change. However, with this additional rounding, this guarantee is broken
for systems with mbw_wd < 7.
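To illustrate the read/write-back point above, here is a minimal user-space
sketch. It is not the MPAM driver's conversion: the round-to-nearest helpers,
their names, and the bw_gran value of 12 used for a 3-bit control are all
assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: percentage -> wd-bit fixed-point fraction, round to nearest. */
static uint32_t pc_to_fixed(uint32_t pc, unsigned int wd)
{
	return (pc * (1u << wd) + 50) / 100;
}

/* Hypothetical: wd-bit fixed-point fraction -> percentage, round to nearest. */
static uint32_t fixed_to_pc(uint32_t hw, unsigned int wd)
{
	uint32_t scale = 1u << wd;

	return (hw * 100 + scale / 2) / scale;
}

/* Same result as the kernel's roundup() for these inputs; gran must be >= 1. */
static uint32_t round_up_to(uint32_t val, uint32_t gran)
{
	return ((val + gran - 1) / gran) * gran;
}

/* Read the schema value, round it to gran, write it back; return the new hw. */
static uint32_t rewrite(uint32_t hw, unsigned int wd, uint32_t gran)
{
	uint32_t pc = fixed_to_pc(hw, wd);

	return pc_to_fixed(round_up_to(pc, gran), wd);
}

/* True if every hardware value survives a read followed by a write-back. */
static bool round_trip_stable(unsigned int wd, uint32_t gran)
{
	for (uint32_t hw = 0; hw <= (1u << wd); hw++) {
		if (rewrite(hw, wd, gran) != hw)
			return false;
	}
	return true;
}
```

Under these assumptions, with wd = 3 a hardware value of 3 reads back as 38%.
Written back unchanged (gran = 1) it stays at 3, but rounded up to a multiple
of 12 it becomes 48% and lands on hardware value 4.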
resctrl is introducing resctrl_arch_preconvert_bw() to allow the arch code
to specify the conversion resctrl does to the user-provided bandwidth
value. Add the MPAM version of resctrl_arch_preconvert_bw(). This does no
conversion.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
drivers/resctrl/mpam_resctrl.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 226ff6f532fa..5a2104af22cc 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -167,6 +167,11 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
return mpam_resctrl_controls[rid].cdp_enabled;
}
+u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r)
+{
+ return val;
+}
+
/**
* resctrl_reset_task_closids() - Reset the PARTID/PMG values for all tasks.
*
--
2.43.0
* [PATCH v3 3/3] fs/resctrl: Factor MBA parse-time conversion to be per-arch
From: Ben Horgan @ 2026-05-15 14:06 UTC
To: ben.horgan
Cc: james.morse, reinette.chatre, fenghuay, linux-kernel,
linux-arm-kernel, tglx, mingo, bp, dave.hansen, hpa, corbet, x86,
linux-doc, dave.martin, Dave Martin, Ben Horgan
From: Dave Martin <Dave.Martin@arm.com>
The control value parser for the MB resource currently coerces the
memory bandwidth percentage value from userspace to be an exact
multiple of the rdt_resource::resctrl_membw::bw_gran parameter.
On MPAM systems, this results in somewhat worse-than-worst-case
rounding, since the bandwidth granularity advertised to resctrl by the
MPAM driver is in general only an approximation to the actual hardware
granularity on these systems, and the hardware bandwidth allocation
control value is not natively a percentage -- necessitating a further
conversion in the resctrl_arch_update_domains() path, regardless of the
conversion done at parse time.
For MPAM and x86, use each architecture's own parse-time conversion,
resctrl_arch_preconvert_bw(). This avoids the accumulated error from
rounding the value twice on MPAM systems. For x86 systems there is no
functional change.
Clarify the documentation, but avoid overly exact promises.
Clamping to bw_min and bw_max still feels generic: leave it in the core
code, for now.
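The resulting split between core and arch code can be sketched in user space
roughly as follows. This is an illustration only, not the kernel code: struct
membw_sketch, the preconvert_fn function pointer and bw_validate_sketch() are
hypothetical stand-ins, and strtoul() replaces the kernel's string parsing.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bandwidth fields of struct rdt_resource. */
struct membw_sketch {
	uint32_t min_bw;
	uint32_t max_bw;
	uint32_t bw_gran;
};

/* Hypothetical hook type modelling resctrl_arch_preconvert_bw(). */
typedef uint32_t (*preconvert_fn)(uint32_t val, const struct membw_sketch *r);

/* x86-style hook: round up to a multiple of bw_gran (bw_gran >= 1). */
static uint32_t preconvert_roundup(uint32_t val, const struct membw_sketch *r)
{
	return ((val + r->bw_gran - 1) / r->bw_gran) * r->bw_gran;
}

/* MPAM-style hook: hand the value through unchanged. */
static uint32_t preconvert_passthrough(uint32_t val, const struct membw_sketch *r)
{
	(void)r;
	return val;
}

/* Core side: parse, range-check against min/max, then defer to the arch hook. */
static bool bw_validate_sketch(const char *buf, uint32_t *data,
			       const struct membw_sketch *r, preconvert_fn hook)
{
	char *end;
	unsigned long bw;

	errno = 0;
	bw = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return false;

	/* The range check stays generic, as in the core code. */
	if (bw < r->min_bw || bw > r->max_bw)
		return false;

	*data = hook((uint32_t)bw, r);
	return true;
}
```

With min_bw = 10, max_bw = 100 and bw_gran = 10, writing "35" stages 40 with
the rounding hook and 35 with the pass-through hook, while "5" is rejected by
the range check in both cases.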
[ BH: Split out x86 specific changes ]
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Ben Horgan <Ben.Horgan@arm.com>
Reviewed-by: Ben Horgan <ben.horgan@arm.com>
---
Documentation/filesystems/resctrl.rst | 17 +++++++++--------
fs/resctrl/ctrlmondata.c | 6 +++---
2 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index b003bed339fd..4322d8025453 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -236,12 +236,11 @@ with respect to allocation:
user can request.
"bandwidth_gran":
- The granularity in which the memory bandwidth
- percentage is allocated. The allocated
- b/w percentage is rounded off to the next
- control step available on the hardware. The
- available bandwidth control steps are:
- min_bandwidth + N * bandwidth_gran.
+ The approximate granularity in which the memory bandwidth
+ percentage is allocated. The allocated bandwidth percentage
+ is rounded up to the next control step available on the
+ hardware. The available hardware steps are no larger than
+ this value.
"delay_linear":
Indicates if the delay scale is linear or
@@ -871,8 +870,10 @@ The minimum bandwidth percentage value for each cpu model is predefined
and can be looked up through "info/MB/min_bandwidth". The bandwidth
granularity that is allocated is also dependent on the cpu model and can
be looked up at "info/MB/bandwidth_gran". The available bandwidth
-control steps are: min_bw + N * bw_gran. Intermediate values are rounded
-to the next control step available on the hardware.
+control steps are, approximately, min_bw + N * bw_gran. The steps may
+appear irregular due to rounding to an exact percentage: bw_gran is the
+maximum interval between the percentage values corresponding to any two
+adjacent steps in the hardware.
The bandwidth throttling is a core specific mechanism on some of Intel
SKUs. Using a high bandwidth and a low bandwidth setting on two threads
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index 9a7dfc48cb2e..934e12f5d145 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -37,8 +37,8 @@ typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
/*
* Check whether MBA bandwidth percentage value is correct. The value is
* checked against the minimum and max bandwidth values specified by the
- * hardware. The allocated bandwidth percentage is rounded to the next
- * control step available on the hardware.
+ * hardware. The allocated bandwidth percentage is converted as
+ * appropriate for consumption by the specific hardware driver.
*/
static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
{
@@ -71,7 +71,7 @@ static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
return false;
}
- *data = roundup(bw, (unsigned long)r->membw.bw_gran);
+ *data = resctrl_arch_preconvert_bw(bw, r);
return true;
}
--
2.43.0