From: Drew Fustini <fustini@kernel.org>
To: Ben Horgan <ben.horgan@arm.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>,
Tony Luck <tony.luck@intel.com>,
James Morse <james.morse@arm.com>,
Dave Martin <Dave.Martin@arm.com>,
Babu Moger <babu.moger@amd.com>, Fenghua Yu <fenghuay@nvidia.com>,
Chen Yu <yu.c.chen@intel.com>, Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Newman <peternewman@google.com>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] mpam,x86,fs/resctrl: Generic schema description Proof of Concept
Date: Wed, 3 Jun 2026 12:34:05 -0700
Message-ID: <aiCBratZchVFVhws@gen8>
In-Reply-To: <29c95b69-e1a4-46b1-ab8b-45c09308b924@arm.com>
On Wed, Jun 03, 2026 at 04:15:51PM +0100, Ben Horgan wrote:
> Hi Reinette,
>
> On 5/29/26 19:06, Reinette Chatre wrote:
> > Hi Everybody,
> >
> > It has been a while since we discussed the resctrl changes required to support
> > hardware that has controls with fine granularity or hardware that has multiple
> > controls per resource. For reference, the most recent email discussion can
> > be found at [1] with a summary of discussions in last year's plumbers slides [2].
> >
> > I created a PoC that I believe supports what folks have agreed to so far. I
> > hope this can help us to restart the discussion with the goal that resctrl gains
> > support for upcoming hardware that require these features.
>
> Thank you very much for doing this work. I believe this will be very useful for
> MPAM and other architectures.
Yes, thanks to Reinette for working on the generic schema proof of
concept. This will be helpful for supporting the RISC-V CBQRI (capacity
and bandwidth QoS) spec.
> I plumbed in support for the MB_MIN resource schema which also works under light
> testing. The only fs resctrl code change I needed was:
>
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -483,6 +483,9 @@ static inline u32 resctrl_get_default_ctrlval(struct
> resctrl_ctrl *ctrl)
> case RESCTRL_CTRL_BITMAP:
> return BIT_MASK(ctrl->cache.cbm_len) - 1;
> case RESCTRL_CTRL_SCALAR:
> + if (ctrl->name == RESCTRL_CTRL_NAME_MIN)
> + return ctrl->membw.min_bw;
> +
> return ctrl->membw.max_bw;
> }
>
>
> At least on MPAM systems, we use a default of 0 for minimum bandwidth controls
> as the maximum bandwidth controls only take effect if their value is higher than
> the minimum bandwidth value. I have specialised this on the ctrl->name which
> breaks your ctrl->type based classification but that's fixable by just adding a
> default field to membw.
This should be useful for RISC-V.
RESCTRL_CTRL_NAME_MIN maps well to CBQRI Rbwb (reserved bandwidth
blocks). The sum of Rbwb across all control groups must be less than
MRBWB (maximum number of reserved bandwidth blocks). As a result, MB_MIN
needs to default to 1 so that the sum does not violate that rule. In my
RFC series, I added default_to_min to resctrl_membw [1] but this
solution looks cleaner.
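
To make the default-field idea concrete, here is a minimal sketch in plain C. It is not code from the PoC: struct sketch_membw, its default_val field and both helpers are hypothetical names used only for illustration. The first helper returns a per-control default without looking at the control name, and the second checks the Rbwb rule described above (sum across control groups less than MRBWB).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the membw part of a control; not the PoC's layout. */
struct sketch_membw {
	uint32_t min_bw;
	uint32_t max_bw;
	uint32_t default_val;	/* assumed new field: reset value chosen by the arch */
};

/* Return the reset value without special-casing the control's name. */
static inline uint32_t sketch_default_ctrlval(const struct sketch_membw *m)
{
	return m->default_val;
}

/*
 * Check the rule described above: the sum of Rbwb over all control
 * groups must be less than MRBWB.
 */
static inline bool sketch_rbwb_sum_ok(const uint32_t *rbwb, size_t n,
				      uint32_t mrbwb)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += rbwb[i];

	return sum < mrbwb;
}
```

With default_val set to 1 for MB_MIN, every new control group starts at the smallest reservation, so creating a group does not push the sum past the limit on its own.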
> > - No support for "read-modify-write" usage of schemata file. This is where we
> > discussed (without agreement) on possibly introducing the "#" prefix to schemata
> > file entries. This PoC does not support this prefix and the current assumption/expectation
> > is that when user space changes a configuration only the new control values are
> > written to schemata file. I thus do not have a plan to support this so please
> > share opinions in this regard if you have some.
>
> There is now less motivation from the MPAM side for this than when this was
> initially discussed. In pre-upstream versions of the MPAM patches a change in
> the MB resource control value would change both the mpam h/w mbw_min and mbw_max
> values but now (on non-broken h/w) we just change the mbw_max. (mbw_min kept at 0).
>
> However, it would be useful not to be limited by percentages. In my quick
> experimentation with your patches I used a percentage value for MB_MIN but it
> would be best to move away from this. For new controls I think we can mandate
> that user space has to discover the resolution from the info directly but how
> can we retrofit this. For MPAM, MB and MB_MAX, would control the same things.
> Could we just add MB_MAX with a h/w friendly scale and then reflect changes in
> MB_MAX in MB and vice versa with MB taking precedence if both are set? Old
> software can continue setting MB can move to using MB_MAX and take advantage of
> the improved control. (I don't think we should expose the MPAM hardware value
> directly as it has confusion over whether all 1s is 100% or not and we'd like to
> have something generic and friendly to the user.)
The facility for non-percentage values is important for RISC-V as CBQRI
does not include a percentage throttle. It has two controls for bandwidth:

- Rbwb: number of reserved bandwidth blocks [1, 2^13]
- Mweight: weighted share of the remaining bandwidth [0, 255]
  - 0: disables work-conserving sharing
  - 1..255: compete for the leftover pool
  - It makes sense for it to default to max (255) so that there won't
    be any unused bandwidth

I think Mweight could be aligned with MPAM's proportional stride.
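
As a rough illustration of what a weighted share of the leftover pool means, the sketch below splits a number of leftover bandwidth blocks in proportion to each group's weight. This is plain proportional arithmetic under my own assumptions, not the CBQRI arbitration logic, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only: split `leftover` bandwidth blocks among groups in
 * proportion to their weights. A weight of 0 opts the group out of the
 * leftover pool. If every weight is 0, nobody receives a share.
 */
static inline void sketch_split_leftover(const uint8_t *weight, size_t n,
					 uint32_t leftover, uint32_t *share)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += weight[i];

	for (size_t i = 0; i < n; i++) {
		if (total)
			share[i] = (uint32_t)((uint64_t)leftover * weight[i] / total);
		else
			share[i] = 0;
	}
}
```

With every group left at the default of 255 the leftover pool is divided evenly, which matches the reasoning above for defaulting to max.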
Here is the patch I created to add Mweight support:
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index d95ab8ad36e2..3537071e3ab0 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -304,6 +304,7 @@ static const char * const resctrl_ctrl_name[] = {
[RESCTRL_CTRL_NAME_DEF] = "",
[RESCTRL_CTRL_NAME_MIN] = "MIN",
[RESCTRL_CTRL_NAME_MAX] = "MAX",
+ [RESCTRL_CTRL_NAME_WGHT] = "WGHT",
};
const char *resctrl_ctrl_name_str(enum resctrl_ctrl_name name)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 72fb7256270e..09efcef9ce66 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -348,12 +348,14 @@ struct resctrl_mon {
* has the same name as the resource.
* @RESCTRL_CTRL_NAME_MIN: "MIN"
* @RESCTRL_CTRL_NAME_MAX: "MAX"
+ * @RESCTRL_CTRL_NAME_WGHT: "WGHT"
*/
enum resctrl_ctrl_name {
RESCTRL_CTRL_NAME_DEF,
RESCTRL_CTRL_NAME_MIN,
RESCTRL_CTRL_NAME_MAX,
- RESCTRL_CTRL_NAME_LAST = RESCTRL_CTRL_NAME_MAX
+ RESCTRL_CTRL_NAME_WGHT,
+ RESCTRL_CTRL_NAME_LAST = RESCTRL_CTRL_NAME_WGHT
};
> > - Controls are independent for now. This means that, for example, if a resource
> > supports a "MIN" and "MAX" control then this implementation would allow user to
> > set the "maximum" control values to be less than the "minimum" control values.
>
> I think this is ok as long as adding support for new controls in resctrl doesn't
> change the existing behaviour. In MPAM we dodged this by introducing MB as only
> affecting the h/w mbw_max and not mbw_min (as mentioned above).
There is no equivalent to MB (percentage throttle) in RISC-V so I would
want it to be valid to have MB_MIN (minimum reservation) without MB.
I rebased my RISC-V CBQRI v6 series on top of this proof of concept and
was able to validate that it works okay in QEMU:
MB_WGHT:72=255
MB_MIN:72=756
L2:64=fff;65=fff
L3:75=ffff
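
For anyone scripting against output like the above, here is a small user-space sketch that looks up one domain's value in a single schemata line. It assumes the "NAME:id=val;id=val" layout shown above, with decimal domain ids; the caller picks base 10 for scalar controls and base 16 for bitmaps. The function name is hypothetical and this is not the kernel's parser.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up the value for domain `id` in one schemata line of the form
 * "NAME:id=val;id=val". Returns true and stores the value in *val when
 * the line is for `name` and the domain is present.
 */
static inline bool sketch_schemata_lookup(const char *line, const char *name,
					  unsigned long id, int base,
					  unsigned long *val)
{
	size_t len = strlen(name);
	const char *p;
	char *end;

	/* The line must start with exactly "NAME:". */
	if (strncmp(line, name, len) != 0 || line[len] != ':')
		return false;

	p = line + len + 1;
	while (*p) {
		unsigned long dom, v;

		dom = strtoul(p, &end, 10);
		if (end == p || *end != '=')
			return false;	/* malformed entry */
		p = end + 1;

		v = strtoul(p, &end, base);
		if (end == p)
			return false;	/* missing value */

		if (dom == id) {
			*val = v;
			return true;
		}
		if (*end != ';')
			break;		/* end of line or trailing junk: stop */
		p = end + 1;
	}
	return false;
}
```

For example, looking up domain 72 in "MB_MIN:72=756" with base 10 stores 756, and looking up domain 65 in "L2:64=fff;65=fff" with base 16 stores 0xfff.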
Thanks,
Drew
[1] https://lore.kernel.org/all/20260601-ssqosid-cbqri-rqsc-v7-0-v6-6-baf00f50028a@kernel.org/