From: Dave Martin <Dave.Martin@arm.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: linux-kernel@vger.kernel.org,
Reinette Chatre <reinette.chatre@intel.com>,
James Morse <james.morse@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Jonathan Corbet <corbet@lwn.net>,
x86@kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be per-arch
Date: Mon, 29 Sep 2025 14:56:19 +0100
Message-ID: <aNqQAy8nOkLRYx4F@e133380.arm.com>
In-Reply-To: <aNXJGw9r_k3BB4Xk@agluck-desk3>
Hi Tony,
Thanks for taking a look at this -- comments below.
[...]
On Thu, Sep 25, 2025 at 03:58:35PM -0700, Luck, Tony wrote:
> On Mon, Sep 22, 2025 at 04:04:40PM +0100, Dave Martin wrote:
[...]
> > Would something like the following work? A read from schemata might
> > produce something like this:
> >
> > MB: 0=50, 1=50
> > # MB_HW: 0=32, 1=32
> > # MB_MIN: 0=31, 1=31
> > # MB_MAX: 0=32, 1=32
[...]
> > I'd be interested in people's thoughts on it, though.
>
> Applying this to Intel upcoming region aware memory bandwidth
> that supports 255 steps and h/w min/max limits.
Following the MPAM example, would you also expect:
scale: 255
unit: 100pc
...?
> We would have info files with "min = 1, max = 255" and a schemata
> file that looks like this to legacy apps:
>
> MB: 0=50;1=75
> #MB_HW: 0=128;1=191
> #MB_MIN: 0=128;1=191
> #MB_MAX: 0=128;1=191
>
> But a newer app that is aware of the extensions can write:
>
> # cat > schemata << 'EOF'
> MB_HW: 0=10
> MB_MIN: 0=10
> MB_MAX: 0=64
> EOF
>
> which then reads back as:
> MB: 0=4;1=75
> #MB_HW: 0=10;1=191
> #MB_MIN: 0=10;1=191
> #MB_MAX: 0=64;1=191
>
> with the legacy line updated with the rounded value of the MB_HW
> supplied by the user. 10/255 = 3.921% ... so call it "4".
I'm suggesting that this always be rounded up, so that you have a
guarantee that the steps are no smaller than the reported value.
(In this case, round-up and round-to-nearest give the same answer
anyway, though!)
>
> The region aware h/w supports separate bandwidth controls for each
> region. We could hope (or perhaps update the spec to define) that
> region0 is always node-local DDR memory and keep the "MB" tag for
> that.
Do you have concerns about existing software choking on the #-prefixed
lines?
> Then use some other tag naming for other regions. Remote DDR,
> local CXL, remote CXL are the ones we think are next in the h/w
> memory sequence. But the "region" concept would allow for other
> options as other memory technologies come into use.
Would it be reasonable just to have a set of these schema instances, per
region, so:
MB_HW: ... // implicitly region 0
MB_HW_1: ...
MB_HW_2: ...
etc.
Or, did you have something else in mind?
My thinking is that we avoid adding complexity in the schemata file if
we treat mapping these schema instances onto the hardware topology as
an orthogonal problem. So long as we have unique names in the schemata
file, we can describe elsewhere what they relate to in the hardware.
Cheers
---Dave