From: Reinette Chatre <reinette.chatre@intel.com>
To: Dave Martin <Dave.Martin@arm.com>
Cc: <linux-kernel@vger.kernel.org>, Tony Luck <tony.luck@intel.com>,
	"James Morse" <james.morse@arm.com>,
	Ben Horgan <ben.horgan@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Jonathan Corbet <corbet@lwn.net>, <x86@kernel.org>,
	<linux-doc@vger.kernel.org>
Subject: Re: [PATCH v2] x86,fs/resctrl: Factor MBA parse-time conversion to be per-arch
Date: Thu, 20 Nov 2025 13:55:22 -0800
Message-ID: <e5928f9f-8c41-45d1-900f-6801eea775a0@intel.com>
In-Reply-To: <aR9KIxMIDrf0V95D@e133380.arm.com>

Hi Dave,

On 11/20/25 9:04 AM, Dave Martin wrote:
> On Wed, Nov 19, 2025 at 07:05:10PM -0800, Reinette Chatre wrote:
>> On 11/18/25 7:47 AM, Dave Martin wrote:
>>> On Fri, Nov 14, 2025 at 02:17:53PM -0800, Reinette Chatre wrote:
>>>> On 10/31/25 8:41 AM, Dave Martin wrote:


 
>>>> With resctrl "coercing" the user provided value before providing it to the architecture
>>>> it controls these control steps to match what the documentation states above. If resctrl
>>>> instead provides the value directly to the architecture I see nothing preventing the
>>>> architecture from ignoring resctrl's "contract" with user space documented above and
>>>> using arbitrary control steps since it also controls resctrl_arch_get_config() that is
>>>> displayed directly to user space. What guarantee is there that resctrl_arch_get_config()
>>>> will display a value that is "approximately" min_bw + N * bw_gran? This seems like opening
>>>
>>> No guarantee, but there will only be one resctrl_arch_preconvert_bw()
>>> per arch.  We'd expect the functions to be simple, so this doesn't feel
>>> like an excessive maintenance burden?
>>
>> Agree.
>>
>>>
>>> (All the resctrl_arch_foo() hooks have to do the right thing after all,
>>> otherwise nothing will work.)
>>>
>>>
>>> With this patch, resctrl_arch_preconvert_bw() and
>>> resctrl_arch_{update_domains,get_config}() between them must provide
>>> the correct behaviour.
>>
>> Right. This blurs the lines of responsibility. I interpret this as:
>> "resctrl fs makes promises to user space that resctrl fs *and* the architecture are
>>  responsible to keep"
> 
> Ack.  It's a joint responsibility.
> 
> Thinking about it, I don't think that the value returned by
> resctrl_arch_preconvert_bw() is used in any way except for storing it
> into rdt_ctrl_domain::staged_configs[] for later consumption by
> resctrl_arch_update_domains().
> 
> If so, the meaning of this value is really arch-determined.
> 
> It is convenient for the x86 implementation to convert this to hardware
> form at parse time, whereas, because of the way the MPAM code is
> structured, it suits the MPAM driver better to convert it later on.
> 
> In the x86 case, it means that the arch code doesn't have to
> distinguish much between input values for different kinds of control:
> it's just a matter of writing precomputed values to MSRs.  This keeps
> parts of the backend simple.
> 
> In the MPAM case, it's more about encapsulating grungy arch-specific
> details in the driver.  The common resctrl structures don't contain all
> the information needed.  We might have to pull a lot of junk out of the
> private headers in order to do the conversion in the context of an inline
> function called by the schemata parser.
> 
> I think that it's likely that either all the conversion work is done in
> resctrl_arch_preconvert_bw() (e.g., x86), or all the work is done in
> the resctrl_arch_update_domains() backend (e.g., MPAM).  There's
> probably no good reason ever to split the work into two parts.
> 
> Does that make sense?

Good point. Yes.
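
To make the two options concrete, here is a minimal sketch of what "all
conversion at parse time" versus "all conversion in the backend" could look
like. It is an illustration only, not code from the patch: every demo_* name,
the struct layout, the linear "max_bw - bw" throttle encoding and the 16-bit
fixed-point fraction are assumptions chosen for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a bandwidth resource's properties. */
struct demo_membw {
	uint32_t min_bw;	/* smallest accepted bandwidth percentage */
	uint32_t bw_gran;	/* advertised step between control values */
	uint32_t max_bw;	/* largest accepted percentage, e.g. 100 */
};

/* Round a user-provided percentage up to the next multiple of bw_gran. */
static uint32_t demo_round_to_gran(uint32_t bw, const struct demo_membw *m)
{
	return ((bw + m->bw_gran - 1) / m->bw_gran) * m->bw_gran;
}

/*
 * Option A (x86-like): do all the work at parse time. The staged value is
 * already in hardware form, so the backend only has to write it out.
 * Assumption: a linear throttle encoded as max_bw minus the rounded value.
 */
static uint32_t demo_preconvert_parse_time(uint32_t bw,
					   const struct demo_membw *m)
{
	return m->max_bw - demo_round_to_gran(bw, m);
}

/*
 * Option B (MPAM-like): stage the user value unchanged and leave the whole
 * conversion to the backend, which has the driver-private details.
 */
static uint32_t demo_preconvert_passthrough(uint32_t bw,
					    const struct demo_membw *m)
{
	(void)m;	/* nothing arch-specific is needed at parse time */
	return bw;
}

/*
 * Backend half of option B. Assumption: the hardware takes a 16-bit
 * fixed-point fraction of full bandwidth.
 */
static uint32_t demo_backend_convert(uint32_t staged_bw)
{
	return (staged_bw * 0xffffu) / 100u;
}
```

In both cases the staged value is opaque to the fs code, which is why its
meaning can be left to the arch as long as exactly one side does the
conversion.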

> 
>>>
>>> But even today, resctrl_arch_update_domains() and
>>> resctrl_arch_get_config() have to do the right thing in order for
>>> bandwidth control values to be handled correctly, as seen through the
>>> schemata interface.
>>
>> ack.
>>
>>>
>>> (x86's job is easy, because the generic interface between the resctrl
>>> core and the arch interface happens to be expressed in terms of raw x86
>>> MSR values due to the history.  But other arches don't get the benefit
>>> of that.)
>>
>> That is just the benefit of being the first architecture to be supported.
>> If determined to cause headaches elsewhere it can surely change.
>>
>>>
>>> The reason for this patch is the generic code can't do the right thing
>>> for MPAM, unless there is a hook into the arch code, arch-/hardware-
>>> specific knowledge is added in the core code, or unless a misleading
>>> bw_gran value is advertised.
>>
>> Understood.
>>
>>>
>>> I tried to take the pragmatic approach, but I'm open to suggestions if
>>> you'd prefer this to be factored in another way.
>>
>> I am ok with this approach and will respond to the patch details separately.
> 
> OK, thanks -- I see you already replied, so I'll respond there.
> 
>>>
>>>> the door even wider for resctrl to become architecture specific ... with this change the
>>>> schemata file becomes a direct channel between user space and the arch that risks users
>>>> needing to tread carefully when switching between different architectures.
>>>
>>> There doesn't feel like a magic solution here.
>>>
>>> Exact bandwidth and flow control behaviour is extremely dependent on
>>> hardware topology and the design of interconnects and on dynamic system
>>> load.  An interface that is generic and rigorously defined, yet also
>>> simple enough to be reasonably usable by portable software would
>>> probably not be implementable on any real hardware platform.
>>>
>>> So, if it's useful to have a generic interface at all, hardware and
>>> software are going to have to meet in the middle somewhere...
>>
>> I believe that we could also use above as a quote in support of the schema
>> description work.
>>
>>>
>>> (The historical behaviour -- and even the interface -- of MB is not
>>> generic either, right?)
>>
>> Right. Even so, I prefer to use MB as motivation to do things better rather
>> than an excuse to make things worse.
>>
>> Reinette
> 
> Cheap shot, I know.
> 
> I guess that more is known now about what kinds of control behaviour
> are likely to exist than was known about when resctrl was first
> developed... though we still don't have a crystal ball.
> 
> My aim is to generalise enough to cover most of what we know about today,
> while not inventing pie-in-the-sky abstractions that may never be
> fully used...  It's a balancing act, though.
> 
> There's a fine line between a "random nonportable junk" schema and
> reasonable exposure of an architecture-specific control that makes
> sense within resctrl but is unlikely to map onto other architectures.
> 
> We should certainly push back on the latter, but could it be appropriate
> to expose arch-specific control types in some situations?  I don't like
> to rule it out absolutely.

It is difficult to fully reason about merits without an example as reference.
We'll have better information when faced with enabling of such a control.
Some initial thoughts are: First, while a control may start as unique to an arch
we cannot predict that such a control will remain so. Second, with the new
schema descriptions it should be possible to easily add support for a
new control type. Of course, the first arch that supports the control type
has the benefit of establishing the needed parameters. It would be ideal if a new
control could be mapped so that it appears all architectures support it but
I do not think resctrl should make that a requirement for support of a new
control.

Reinette



