From: Reinette Chatre <reinette.chatre@intel.com>
To: Dave Martin <Dave.Martin@arm.com>, <linux-kernel@vger.kernel.org>
Cc: Tony Luck <tony.luck@intel.com>,
James Morse <james.morse@arm.com>,
"Ben Horgan" <ben.horgan@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
"Jonathan Corbet" <corbet@lwn.net>, <x86@kernel.org>,
<linux-doc@vger.kernel.org>
Subject: Re: [PATCH v2] x86,fs/resctrl: Factor MBA parse-time conversion to be per-arch
Date: Thu, 20 Nov 2025 08:46:24 -0800 [thread overview]
Message-ID: <45b96e99-ac5e-4546-b58b-f4d062d2f823@intel.com> (raw)
In-Reply-To: <20251031154225.14799-1-Dave.Martin@arm.com>
Hi Dave,
On 10/31/25 8:41 AM, Dave Martin wrote:
> The control value parser for the MB resource currently coerces the
> memory bandwidth percentage value from userspace to be an exact
> multiple of the rdt_resource::resctrl_membw::bw_gran parameter.
>
> On MPAM systems, this results in somewhat worse-than-worst-case
> rounding, since the bandwidth granularity advertised to resctrl by the
> MPAM driver is in general only an approximation to the actual hardware
> granularity on these systems, and the hardware bandwidth allocation
> control value is not natively a percentage -- necessitating a further
> conversion in the resctrl_arch_update_domains() path, regardless of the
> conversion done at parse time.
>
> Allow the arch to provide its own parse-time conversion that is
> appropriate for the hardware, and move the existing conversion to x86.
> This will avoid accumulated error from rounding the value twice on MPAM
> systems.
>
> Clarify the documentation, but avoid overly exact promises.
>
> Clamping to bw_min and bw_max still feels generic: leave it in the core
> code, for now.
>
> No functional change.
Please use max line length available. Changelogs have to conform before merge
and having patch ready saves this work.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>
> ---
...
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index b7f35b07876a..fbbcf5421346 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -144,12 +144,11 @@ with respect to allocation:
> user can request.
>
> "bandwidth_gran":
> - The granularity in which the memory bandwidth
> - percentage is allocated. The allocated
> - b/w percentage is rounded off to the next
> - control step available on the hardware. The
> - available bandwidth control steps are:
> - min_bandwidth + N * bandwidth_gran.
> + The approximate granularity in which the memory bandwidth
> + percentage is allocated. The allocated bandwidth percentage
> + is rounded up to the next control step available on the
> + hardware. The available hardware steps are no larger than
> + this value.
>
> "delay_linear":
> Indicates if the delay scale is linear or
> @@ -737,8 +736,10 @@ The minimum bandwidth percentage value for each cpu model is predefined
> and can be looked up through "info/MB/min_bandwidth". The bandwidth
> granularity that is allocated is also dependent on the cpu model and can
> be looked up at "info/MB/bandwidth_gran". The available bandwidth
> -control steps are: min_bw + N * bw_gran. Intermediate values are rounded
> -to the next control step available on the hardware.
> +control steps are, approximately, min_bw + N * bw_gran. The steps may
> +appear irregular due to rounding to an exact percentage: bw_gran is the
> +maximum interval between the percentage values corresponding to any two
> +adjacent steps in the hardware.
>
> The bandwidth throttling is a core specific mechanism on some of Intel
> SKUs. Using a high bandwidth and a low bandwidth setting on two threads
The documentation changes look good to me. Thank you.
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 1189c0df4ad7..dc05a50e80fb 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -16,9 +16,15 @@
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
> #include <linux/cpu.h>
> +#include <linux/math.h>
>
> #include "internal.h"
>
> +u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r)
> +{
> + return roundup(val, (unsigned long)r->membw.bw_gran);
> +}
> +
> int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> u32 closid, enum resctrl_conf_type t, u32 cfg_val)
> {
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index 0d0ef54fc4de..26e3f7c88c86 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -35,8 +35,8 @@ typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
> /*
> * Check whether MBA bandwidth percentage value is correct. The value is
> * checked against the minimum and max bandwidth values specified by the
> - * hardware. The allocated bandwidth percentage is rounded to the next
> - * control step available on the hardware.
> + * hardware. The allocated bandwidth percentage is converted as
> + * appropriate for consumption by the specific hardware driver.
The text looks good but adjusting the right margin mid-paragraph seems unnecessary?
> */
> static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
> {
> @@ -69,7 +69,7 @@ static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
> return false;
> }
>
> - *data = roundup(bw, (unsigned long)r->membw.bw_gran);
> + *data = resctrl_arch_preconvert_bw(bw, r);
> return true;
> }
>
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index a7d92718b653..1fb6e2118b61 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -485,6 +485,20 @@ bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
> */
> int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
>
> +/*
> + * Convert a bandwidth control value to the appropriate form for
> + * consumption by the hardware driver for resource r.
When being as descriptive, please switch to proper kernel-doc instead. There are
a couple of examples for reference in this header file.
> + *
> + * For example, it simplifies the x86 RDT implementation to round the
> + * value to a suitable step here and then treat the resulting value as
> + * opaque when programming the hardware MSRs later on.
This "For example" can be dropped.
> + *
> + * Architectures for which this pre-conversion hook is not useful
> + * should supply an implementation of this function that just returns
> + * val unmodified.
> + */
> +u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r);
> +
> /*
> * Update the ctrl_val and apply this config right now.
> * Must be called on one of the domain's CPUs.
>
> base-commit: dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa
Reinette
next prev parent reply other threads:[~2025-11-20 16:46 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-31 15:41 [PATCH v2] x86,fs/resctrl: Factor MBA parse-time conversion to be per-arch Dave Martin
2025-11-10 11:06 ` Ben Horgan
2025-11-20 15:30 ` Dave Martin
2025-11-14 22:17 ` Reinette Chatre
2025-11-18 15:47 ` Dave Martin
2025-11-20 3:05 ` Reinette Chatre
2025-11-20 17:04 ` Dave Martin
2025-11-20 21:55 ` Reinette Chatre
2025-11-20 16:46 ` Reinette Chatre [this message]
2025-11-20 17:42 ` Dave Martin
2025-11-20 22:01 ` Reinette Chatre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45b96e99-ac5e-4546-b58b-f4d062d2f823@intel.com \
--to=reinette.chatre@intel.com \
--cc=Dave.Martin@arm.com \
--cc=ben.horgan@arm.com \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=james.morse@arm.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).