Date: Wed, 3 Jun 2026 12:34:05 -0700
From: Drew Fustini
To: Ben Horgan
Cc: Reinette Chatre, Tony Luck, James Morse, Dave Martin, Babu Moger,
    Fenghua Yu, Chen Yu, Borislav Petkov, Thomas Gleixner, Dave Hansen,
    Peter Newman, "x86@kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC] mpam,x86,fs/resctrl: Generic schema description Proof of Concept
In-Reply-To: <29c95b69-e1a4-46b1-ab8b-45c09308b924@arm.com>

On Wed, Jun 03, 2026 at 04:15:51PM +0100, Ben Horgan wrote:
> Hi Reinette,
>
> On 5/29/26 19:06, Reinette Chatre wrote:
> > Hi Everybody,
> >
> > It has been a while since we discussed the resctrl changes required to support
> > hardware that has controls with fine granularity or hardware that has multiple
> > controls per resource. For reference, the most recent email discussion can
> > be found at [1] with a summary of discussions in last year's plumbers slides [2].
> >
> > I created a PoC that I believe supports what folks have agreed to so far. I
> > hope this can help us to restart the discussion with the goal that resctrl gains
> > support for upcoming hardware that require these features.
>
> Thank you very much for doing this work. I believe this will be very useful for
> MPAM and other architectures.

Yes, thanks to Reinette for working on the generic schema proof of concept.
This will be helpful for supporting the RISC-V CBQRI (capacity and
bandwidth QoS) spec.

> I plumbed in support for the MB_MIN resource schema which also works under light
> testing.
> The only fs resctrl code change I needed was:
>
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -483,6 +483,9 @@ static inline u32 resctrl_get_default_ctrlval(struct
> resctrl_ctrl *ctrl)
>         case RESCTRL_CTRL_BITMAP:
>                 return BIT_MASK(ctrl->cache.cbm_len) - 1;
>         case RESCTRL_CTRL_SCALAR:
> +               if (ctrl->name == RESCTRL_CTRL_NAME_MIN)
> +                       return ctrl->membw.min_bw;
> +
>                 return ctrl->membw.max_bw;
>         }
>
>
> At least on MPAM systems, we use a default of 0 for minimum bandwidth controls
> as the maximum bandwidth controls only take effect if their value is higher than
> the minimum bandwidth value. I have specialised this on the ctrl->name which
> breaks your ctrl->type based classification but that's fixable by just adding a
> default field to membw.

This should be useful for RISC-V. RESCTRL_CTRL_NAME_MIN maps well to CBQRI
Rbwb (reserved bandwidth blocks). The sum of Rbwb across all control groups
must be less than MRBWB (maximum number of reserved bandwidth blocks). As a
result, MB_MIN needs to default to 1 so that the sum does not violate that
rule. In my RFC series, I added default_to_min to resctrl_membw [1] but this
solution looks cleaner.

> > - No support for "read-modify-write" usage of schemata file. This is where we
> >   discussed (without agreement) on possibly introducing the "#" prefix to schemata
> >   file entries. This PoC does not support this prefix and the current assumption/expectation
> >   is that when user space changes a configuration only the new control values are
> >   written to schemata file. I thus do not have a plan to support this so please
> >   share opinions in this regard if you have some.
>
> There is now less motivation from the MPAM side for this than when this was
> initially discussed. In pre-upstream versions of the MPAM patches a change in
> the MB resource control value would change both the mpam h/w mbw_min and mbw_max
> values but now (on non-broken h/w) we just change the mbw_max. (mbw_min kept at 0).
>
> However, it would be useful not to be limited by percentages. In my quick
> experimentation with your patches I used a percentage value for MB_MIN but it
> would be best to move away from this. For new controls I think we can mandate
> that user space has to discover the resolution from the info directly but how
> can we retrofit this. For MPAM, MB and MB_MAX, would control the same things.
> Could we just add MB_MAX with a h/w friendly scale and then reflect changes in
> MB_MAX in MB and vica versa with MB taking precedent if both are set? Old
> software can continue setting MB can move to using MB_MAX and take advantage of
> the improved control. (I don't think we should expose the MPAM hardware value
> directly as it has confusion over whether all 1s is 100% or not and we'd like to
> have something generic and friendly to the user.)

Support for non-percentage values is important for RISC-V as CBQRI does not
include a percentage throttle. It has two controls for bandwidth:

- Rbwb: number of reserved bandwidth blocks [1, 2^13]
- Mweight: weighted share of the remaining bandwidth [0, 255]
  - 0: disables work-conserving sharing
  - 1..255: compete for the leftover pool
  - It makes sense for it to default to max (255) so that there won't be
    any unused bandwidth

I think Mweight could be aligned with MPAM's proportional stride.
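
The ranges and defaults above can be sketched in standalone C. This is
illustrative only, not code from the CBQRI series or from resctrl: struct
bw_cfg, the helper names and the RBWB_*/MWEIGHT_MAX macros are all
hypothetical, and the strict "less than" in the sum check simply follows
the wording earlier in this mail rather than a verified reading of the spec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, copied from the ranges quoted in this mail. */
#define RBWB_MIN	1u
#define RBWB_MAX	(1u << 13)
#define MWEIGHT_MAX	255u

/* Hypothetical per-control-group bandwidth settings (not a kernel struct). */
struct bw_cfg {
	uint32_t rbwb;		/* reserved bandwidth blocks */
	uint32_t mweight;	/* weighted share of the leftover bandwidth */
};

/* Defaults as discussed: smallest reservation, largest weight. */
static struct bw_cfg bw_cfg_default(void)
{
	return (struct bw_cfg){ .rbwb = RBWB_MIN, .mweight = MWEIGHT_MAX };
}

/* Range check for a single group; mweight 0 is allowed (no sharing). */
static bool bw_cfg_in_range(const struct bw_cfg *c)
{
	return c->rbwb >= RBWB_MIN && c->rbwb <= RBWB_MAX &&
	       c->mweight <= MWEIGHT_MAX;
}

/*
 * Check every group and the sum rule. Assumption: the sum of Rbwb must be
 * strictly less than mrbwb, as worded in this mail.
 */
static bool bw_cfgs_valid(const struct bw_cfg *cfgs, size_t n, uint64_t mrbwb)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (!bw_cfg_in_range(&cfgs[i]))
			return false;
		sum += cfgs[i].rbwb;
	}
	return sum < mrbwb;
}
```

With this shape, a newly created group that takes bw_cfg_default() adds
only one block to the sum, which is why a default of 1 (rather than the
maximum) keeps group creation from violating the reservation limit.
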

Here is the patch I created to add Mweight support:

diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index d95ab8ad36e2..3537071e3ab0 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -304,6 +304,7 @@ static const char * const resctrl_ctrl_name[] = {
        [RESCTRL_CTRL_NAME_DEF] = "",
        [RESCTRL_CTRL_NAME_MIN] = "MIN",
        [RESCTRL_CTRL_NAME_MAX] = "MAX",
+       [RESCTRL_CTRL_NAME_WGHT] = "WGHT",
 };

 const char *resctrl_ctrl_name_str(enum resctrl_ctrl_name name)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 72fb7256270e..09efcef9ce66 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -348,12 +348,14 @@ struct resctrl_mon {
  * has the same name as the resource.
  * @RESCTRL_CTRL_NAME_MIN: "MIN"
  * @RESCTRL_CTRL_NAME_MAX: "MAX"
+ * @RESCTRL_CTRL_NAME_WGHT: "WGHT"
  */
 enum resctrl_ctrl_name {
        RESCTRL_CTRL_NAME_DEF,
        RESCTRL_CTRL_NAME_MIN,
        RESCTRL_CTRL_NAME_MAX,
-       RESCTRL_CTRL_NAME_LAST = RESCTRL_CTRL_NAME_MAX
+       RESCTRL_CTRL_NAME_WGHT,
+       RESCTRL_CTRL_NAME_LAST = RESCTRL_CTRL_NAME_WGHT
 };

> > - Controls are independent for now. This means that, for example, if a resource
> >   supports a "MIN" and "MAX" control then this implementation would allow user to
> >   set the "maximum" control values to be less than the "minimum" control values.
>
> I think this is ok as long as adding support for new controls in resctrl doesn't
> change the existing behaviour. In MPAM we dodged this by introducing MB as only
> affecting the h/w mbw_max and not mbw_min (as mentioned above).

There is no equivalent to MB (percentage throttle) in RISC-V, so I would
want it to be valid to have MB_MIN (minimum reservation) without MB.

I rebased my RISC-V CBQRI v6 series on top of this proof of concept and
was able to validate that it works in QEMU:

  MB_WGHT:72=255
  MB_MIN:72=756
  L2:64=fff;65=fff
  L3:75=ffff

Thanks,
Drew

[1] https://lore.kernel.org/all/20260601-ssqosid-cbqri-rqsc-v7-0-v6-6-baf00f50028a@kernel.org/