* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Moger, Babu @ 2026-04-22 0:17 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <de608041-bc45-4ca0-81fe-423a5167d7d0@intel.com>
Hi Reinette,
On 4/21/2026 5:44 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/21/26 3:04 PM, Moger, Babu wrote:
>> My bad. My only motivation was to keep the mode listing display consistent.
>
> The listing display is already inconsistent since the different modes have different
> global properties, no?
>
Yes. That is true.
>>
>> That said, I agree we need to support this. Without it, we won’t be able to move the group from PLZA to non-PLZA.
>>
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon:
>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>> [global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
>
> Like above where the listing is inconsistent. Is this what you mean?
I meant the listing of "inherit_ctrl_and_mon" does not have groups while
other modes have it.
>
> sidenote: Should the last line be "[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]"?
Yes.
>
>>
>> # echo "inherit_ctrl_and_mon:group=ctrl1/mon1/" > info/kernel_mode
>
> This does not look right. Why is a "group" property needed here? Can the mode not just
> be set by itself? Specifically, why not just:
>
> # echo "inherit_ctrl_and_mon" > info/kernel_mode
We can go with this based on your another comment below. While changing
the mode use the defaults if properties are not provided.
>
> This reminds me that there is still an open remaining from
> https://lore.kernel.org/lkml/71099958-1ddf-40dc-8a3c-aa13d0c56fee@intel.com/
> Specifically this from that message:
> The named fields could be made optional, if group is omitted then it will become the
> default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
> This may not be intuitive since a user may expect that not mentioning a field means
> that the field is left untouched. Have you considered this scenario in your proposal?
>
> I think this needs some clear description of behavior wrt properties, for example:
> - Is it required to provide all properties on each write? More specifically, can user expect there
> to be "default" values when a property is not provided or is user required to provide a value
> for each property? We need to be careful here because we do not want user scripts to fail when a new
> property is added in the future. What if resctrl specifies that if user space does not provide
> a property then resctrl will pick a default. For example, if user runs:
> # echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
> then resctrl will switch to "global_assign_ctrl_assign_mon_per_cpu" mode initialized to
> the default group.
> I am not sure if resctrl needs to support re-configuration of modes in the future where the
> mode stays the same but a property changes? Consider, for example,
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon:]
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>
> # echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
> /*
> * resctrl switches to "global_assign_ctrl_assign_mon_per_cpu" mode and sets
> * PLZA group to default group
> */
> # cat info/kernel_mode
> inherit_ctrl_and_mon:
> [global_assign_ctrl_assign_mon_per_cpu:group=//]
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
> /*
> * resctrl stays in "global_assign_ctrl_assign_mon_per_cpu" mode and sets
> * PLZA group to default group
> */
I think you meant "PLZA group to ctrl1/mon1/" here.
> # cat info/kernel_mode
> inherit_ctrl_and_mon:
> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
> # echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
> /*
> * TBD: should resctrl switch back to default group or just keep
> * group as ctrl1/mon1/ ?
> */
>
> resctrl could thus specify different behavior for switching to a mode where all properties
> not specified obtains default values and re-configuring a mode where only specified
> properties are changed. That means, the "TBD" above would be that the group stays
> as ctrl1/mon1/. So,
> # cat info/kernel_mode
> inherit_ctrl_and_mon:
> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>
> What do you think?
Yes. Sure. We can do that. We only have 2 properties now (mode and
group). We should be able to handle that.
>
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon:
>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>> [global_assign_ctrl_assign_mon_per_cpu]:group=uninitialized
> This does not look right. After switching the kernel_mode to inherit_ctrl_and_mon
> I expect inherit_ctrl_and_mon to be the active mode?
Yes. inherit_ctrl_and_mon should be active here.
Thanks
Babu
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 22:44 UTC (permalink / raw)
To: Moger, Babu, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <9d8a18da-14e4-4d90-a224-7d69d4daeb13@amd.com>
Hi Babu,
On 4/21/26 3:04 PM, Moger, Babu wrote:
> My bad. My only motivation was to keep the mode listing display consistent.
The listing display is already inconsistent since the different modes have different
global properties, no?
>
> That said, I agree we need to support this. Without it, we won’t be able to move the group from PLZA to non-PLZA.
>
> # cat info/kernel_mode
> inherit_ctrl_and_mon:
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
> [global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
Like above where the listing is inconsistent. Is this what you mean?
sidenote: Should the last line be "[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]"?
>
> # echo "inherit_ctrl_and_mon:group=ctrl1/mon1/" > info/kernel_mode
This does not look right. Why is a "group" property needed here? Can the mode not just
be set by itself? Specifically, why not just:
# echo "inherit_ctrl_and_mon" > info/kernel_mode
This reminds me that there is still an open remaining from
https://lore.kernel.org/lkml/71099958-1ddf-40dc-8a3c-aa13d0c56fee@intel.com/
Specifically this from that message:
The named fields could be made optional, if group is omitted then it will become the
default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
This may not be intuitive since a user may expect that not mentioning a field means
that the field is left untouched. Have you considered this scenario in your proposal?
I think this needs some clear description of behavior wrt properties, for example:
- Is it required to provide all properties on each write? More specifically, can user expect there
to be "default" values when a property is not provided or is user required to provide a value
for each property? We need to be careful here because we do not want user scripts to fail when a new
property is added in the future. What if resctrl specifies that if user space does not provide
a property then resctrl will pick a default. For example, if user runs:
# echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
then resctrl will switch to "global_assign_ctrl_assign_mon_per_cpu" mode initialized to
the default group.
I am not sure if resctrl needs to support re-configuration of modes in the future where the
mode stays the same but a property changes? Consider, for example,
# cat info/kernel_mode
[inherit_ctrl_and_mon:]
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
# echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
/*
* resctrl switches to "global_assign_ctrl_assign_mon_per_cpu" mode and sets
* PLZA group to default group
*/
# cat info/kernel_mode
inherit_ctrl_and_mon:
[global_assign_ctrl_assign_mon_per_cpu:group=//]
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
/*
* resctrl stays in "global_assign_ctrl_assign_mon_per_cpu" mode and sets
* PLZA group to default group
*/
# cat info/kernel_mode
inherit_ctrl_and_mon:
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
# echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
/*
* TBD: should resctrl switch back to default group or just keep
* group as ctrl1/mon1/ ?
*/
resctrl could thus specify different behavior for switching to a mode where all properties
not specified obtains default values and re-configuring a mode where only specified
properties are changed. That means, the "TBD" above would be that the group stays
as ctrl1/mon1/. So,
# cat info/kernel_mode
inherit_ctrl_and_mon:
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
What do you think?
> # cat info/kernel_mode
> inherit_ctrl_and_mon:
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
> [global_assign_ctrl_assign_mon_per_cpu]:group=uninitialized
This does not look right. After switching the kernel_mode to inherit_ctrl_and_mon
I expect inherit_ctrl_and_mon to be the active mode?
Reinette
^ permalink raw reply
* Re: [PATCH v2 06/31] x86/virt/tdx: Read global metadata for TDX Module Extensions/Connect
From: Dan Williams @ 2026-04-21 22:19 UTC (permalink / raw)
To: Xu Yilun, linux-coco, linux-pci, x86
Cc: chao.gao, dave.jiang, baolu.lu, yilun.xu, yilun.xu,
zhenzhong.duan, kvm, rick.p.edgecombe, dave.hansen, kas,
xiaoyao.li, vishal.l.verma, linux-kernel
In-Reply-To: <20260327160132.2946114-7-yilun.xu@linux.intel.com>
Xu Yilun wrote:
> Add reading of the global metadata for TDX Module Extensions & TDX
> Connect. Add them in a batch as TDX Connect is currently the only user
> of TDX Module Extensions and no way to initialize TDX Module Extensions
> without firstly enabling TDX Connect.
>
> TDX Module Extensions & TDX Connect are optional features enumerated by
> TDX_FEATURES0. Check the TDX_FEATURES0 before reading these metadata to
> avoid failing the whole TDX initialization.
I think it is important to distinguish "optional" module features vs
required Linux features. Linux requires all features that a module
advertises to succeed at core TDX init time.
Otherwise, this looks ok / consistent with other metadata reading. It
sets the precedent that if TDX Connect is advertised it must succeed all
core initialization.
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Moger, Babu @ 2026-04-21 22:04 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <c9e10de7-f5b1-4a38-be1f-f75bc1ae7780@intel.com>
Hi Reinette,
On 4/21/2026 3:57 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/21/26 11:19 AM, Babu Moger wrote:
>> On 4/21/26 12:35, Reinette Chatre wrote:
>>> On 4/21/26 9:46 AM, Babu Moger wrote:
>>>> On 4/21/26 11:15, Reinette Chatre wrote:
>>>>> On 4/21/26 8:08 AM, Babu Moger wrote:
>
>
>>>>>>
>>>>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>>>>>
>>>>>> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
>
> Here you question why "inherit_ctrl_and_mon" is needed ...
>
>>>>>>
>>>>>>
>>>>>> System boots up with following options if PLZA is supported.
>>>>>>
>>>>>> # cat info/kernel_mode
>>>>>> global_assign_ctrl_assign_mon_per_cpu
>>>>>> global_assign_ctrl_inherit_mon_per_cpu
>>>>>>
>>>>>> No groups are associated with kernel mode at this point.
>>>>>
>>>>> To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
>>>>> above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
>>>>> we will need to add documentation that describes to user space how this file should be interpreted.
>>>>> It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
>>>>>
>>>>> I think something like below will be more intuitive and not need much additional
>>>>> documentation to understand (I am just adding the "uninitialized" as an example to match text
>>>>> printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
>>>>> the lack of "/" could be used to make it clear what this means?):
>>>>>
>>>>> # cat info/kernel_mode
>>>>> [inherit_ctrl_and_mon]
>>>>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>>>>> global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
>>>>>
>
> Above I share considerations when thinking whether to keep "inherit_ctrl_and_mon" or not ...
>
>>>>
>>>> Sounds ok to me.
>
> ... to which you seem to agree ...
>
>>>>
>>>>
>>>>> I also think an interface like this would be simpler for user space to use as it (user space) switches
>>>>> between PLZA capable and non-PLZA capable systems since user space need not associate existence of
>>>>> the file with some kernel mode state in addition to actual content of the file when it does exist.
>
>
> ... more considerations from me when thinking whether to keep "inherit_ctrl_and_mon" or not ...
My bad. My only motivation was to keep the mode listing display consistent.
That said, I agree we need to support this. Without it, we won’t be able
to move the group from PLZA to non-PLZA.
# cat info/kernel_mode
inherit_ctrl_and_mon:
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
[global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
# echo "inherit_ctrl_and_mon:group=ctrl1/mon1/" > info/kernel_mode
# cat info/kernel_mode
inherit_ctrl_and_mon:
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
[global_assign_ctrl_assign_mon_per_cpu]:group=uninitialized
Thanks
Babu
^ permalink raw reply
* Re: [PATCH v2 05/31] x86/virt/tdx: Extend tdx_page_array to support IOMMU_MT
From: Dan Williams @ 2026-04-21 21:51 UTC (permalink / raw)
To: Xu Yilun, Dan Williams
Cc: Edgecombe, Rick P, Gao, Chao, Xu, Yilun, x86@kernel.org,
kas@kernel.org, baolu.lu@linux.intel.com,
dave.hansen@linux.intel.com, Li, Xiaoyao, Jiang, Dave,
linux-pci@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, Duan, Zhenzhong, Verma, Vishal L,
kvm@vger.kernel.org
In-Reply-To: <aeSTPuR9cuga+I69@yilunxu-OptiPlex-7050>
Xu Yilun wrote:
> On Fri, Apr 17, 2026 at 04:58:43PM -0700, Dan Williams wrote:
> > Xu Yilun wrote:
> > [..]
> > > >
> > > > I'm drafting some changes and make the tdx_page_array look like:
> > > >
> > > > struct tdx_page_array {
> > > > /* public: */
> > > > unsigned int nr_pages;
> > > > struct page **pages;
> > > >
> > > > /* private: */
> > > > u64 *root;
> > > > bool flush_on_free;
> >
> > How about "need_phymem_page_wbinvd"?
>
> Yes.
>
> >
> > That makes it a bit more greppable and not to be confused with other
> > flushing.
> >
> > [..]
> > > Hi, I end up made the following changes on top of this series:
> > >
> > > -------8<--------
> > >
> > > arch/x86/include/asm/tdx.h | 32 +-
> > > arch/x86/virt/vmx/tdx/tdx.c | 561 ++++++++------------------
> > > drivers/virt/coco/tdx-host/tdx-host.c | 179 ++++++--
> > > 3 files changed, 316 insertions(+), 456 deletions(-)
> > >
> > > + ret = tdx_ext_mem_setup(nr_pages, &ext_mem);
> > > if (ret)
> > > + return ret;
> > > }
> > >
> > > + ret = tdx_ext_init();
> > > + if (ret)
> > > + goto out_remove_ext_mem;
> > > +
> > > /*
> > > + * Extensions memory is never reclaimed once assigned, stop tracking it
> > > + * and free the tracking structures.
> > > */
> > > + tdx_page_array_free(ext_mem.chunk);
> >
> > Wait, these pages belong to the module now, they can't be freed, or I am
> > missing something?
>
> With this new solution, tdx_page_array is downgraded to a descriptor,
> doesn't manage the actual data pages/memory any more. So
> tdx_page_array_free() will not free data pages, only frees the
> tdx_page_array descriptor.
Oh, I was confused by the fact that tdx_page_array_free() still loops
through array->pages in the need_wbinvd case. In the case of "never
reclaim" it will also "never wbinvd". ...and this why populate has that
"WARN_ON_ONCE(array->pages && array->flush_on_free);".
A couple recommendations come to mind:
* s/tdx_page_array_free/tdx_page_array_destroy/
...since "destroy" mirrors create and matches other cases where only
metadata is managed.
* Create a new tdx_page_array_repopulate() helper to make it clear which
paths depend on being able to repopulate and move the WARN_ON_ONCE() out of
the common path that does not repopulate. "repopulate" can have
"realloc" semantics where it allocates on first use, but otherwise
"populate" gets to not care about the corner cases. Make the WARN case
fail repopulate.
> > > pr_info("%lu KB allocated for TDX Module Extensions\n",
> > > nr_pages * PAGE_SIZE / 1024);
> > >
> > > return 0;
> > >
> > > -out_flush:
> > > - if (ext_mem)
> > > +out_remove_ext_mem:
> > > + if (nr_pages) {
> > > + /*
> > > + * TDH.EXT.MEM.ADD only collects required memory. TDX.EXT.INIT
> > > + * does the actual initialization so if it fails some pages may
> > > + * have been touched by the TDX module, flush cache before
> > > + * returning these pages to kernel.
> > > + */
> > > wbinvd_on_all_cpus();
> > > + tdx_ext_mem_remove(&ext_mem);
> >
> > This only releases the last populated chunk, not all previous chunks,
> > right?
>
> Not true. ext_mem stores all the data pages and the reusable descriptor
> 'chunk' for SEAMCALL. tdx_ext_mem_remove() removes all the data pages
> and the 'chunk'.
Yes, see that now.
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 20:57 UTC (permalink / raw)
To: Babu Moger, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <d693f797-65f6-46ed-bd49-beaeee2da858@amd.com>
Hi Babu,
On 4/21/26 11:19 AM, Babu Moger wrote:
> On 4/21/26 12:35, Reinette Chatre wrote:
>> On 4/21/26 9:46 AM, Babu Moger wrote:
>>> On 4/21/26 11:15, Reinette Chatre wrote:
>>>> On 4/21/26 8:08 AM, Babu Moger wrote:
>>>>>
>>>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>>>>
>>>>> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
Here you question why "inherit_ctrl_and_mon" is needed ...
>>>>>
>>>>>
>>>>> System boots up with following options if PLZA is supported.
>>>>>
>>>>> # cat info/kernel_mode
>>>>> global_assign_ctrl_assign_mon_per_cpu
>>>>> global_assign_ctrl_inherit_mon_per_cpu
>>>>>
>>>>> No groups are associated with kernel mode at this point.
>>>>
>>>> To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
>>>> above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
>>>> we will need to add documentation that describes to user space how this file should be interpreted.
>>>> It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
>>>>
>>>> I think something like below will be more intuitive and not need much additional
>>>> documentation to understand (I am just adding the "uninitialized" as an example to match text
>>>> printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
>>>> the lack of "/" could be used to make it clear what this means?):
>>>>
>>>> # cat info/kernel_mode
>>>> [inherit_ctrl_and_mon]
>>>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>>>> global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
>>>>
Above I share considerations when thinking whether to keep "inherit_ctrl_and_mon" or not ...
>>>
>>> Sounds ok to me.
... to which you seem to agree ...
>>>
>>>
>>>> I also think an interface like this would be simpler for user space to use as it (user space) switches
>>>> between PLZA capable and non-PLZA capable systems since user space need not associate existence of
>>>> the file with some kernel mode state in addition to actual content of the file when it does exist.
... more considerations from me when thinking whether to keep "inherit_ctrl_and_mon" or not ...
>>>>
>>>> I assumed that info/kernel_mode can just always be made visible and not depend on PLZA
>>>> capable hardware. This means that on Intel and Arm this file can show:
>>>>
>>>> # cat info/kernel_mode
>>>> [inherit_ctrl_and_mon]
>>>>
>>>
>>> Yes. Sure.
... to which you seem to agree ...
>>>
>>>
>>>> For Intel this is accurate and also for Arm if I interpret the Arm implementation correctly
>>>> (see mpam_thread_switch()) in https://lore.kernel.org/lkml/20260313144617.3420416-7-ben.horgan@arm.com/
... and even more considerations from me when thinking whether to keep "inherit_ctrl_and_mon" or not.
...
>>> There is one problem here. The mode "inherit_ctrl_and_mon" listing not consistent with others.
>>
>> It is difficult to predict what resctrl will be asked to support next. One possibility here is
>> to make it part of the original design that the first field is the "mode" and the following field
>> contains that mode's global properties of which there could be more than one. Above shows that
>> the two "global" modes have a single global property but we could just try to be safe with some
>> documentation that states there could be more.
>>
>> Consider for example some hypothetical future where the file looks like:
>>
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon:some_unique_capability=true
>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized;other_property=val
>> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>>
>> To leave room for growth the file could start out by, for example, appending ":"
>> to "inherit_ctrl_and_mon" to indicate that there are no known properties yet? Something like
>> below. Would this be more consistent with the others?
>
> To me, it might be clearer to simply document what the default mode is when kernel mode is not enabled, and omit "inherit_ctrl_and_mon" from the display.
... and now you question again why "inherit_ctrl_and_mon" should be included in display without
a motivation why and without addressing any of the previous considerations motivating its
inclusion. How can I respond when you clearly ignore my response to the previous time you asked
this question?
My previous comments are still valid. You mention that "it might be clearer to simply document what
the default mode is when kernel mode is not enabled". To me there is not really a "disabled" kernel mode
since kernel work done on behalf of a task needs to be done with *some* allocation - kernel mode is not
"disabled". Why should resctrl not make it clear what this behavior is? Adding another consideration to
the list ... what if resctrl needs to support some other "default" mode in the future? How can a user
know that not having an active mode means one or the other "default" mode?
If you feel that "inherit_ctrl_and_mon" should be omitted then please motivate why and also address why
the considerations I mentioned are not valid.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Babu Moger @ 2026-04-21 18:19 UTC (permalink / raw)
To: Reinette Chatre, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <0334ba64-71b3-40bd-8cce-9f0f119e7dc9@intel.com>
Hi Reinette,
On 4/21/26 12:35, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/21/26 9:46 AM, Babu Moger wrote:
>> On 4/21/26 11:15, Reinette Chatre wrote:
>>> On 4/21/26 8:08 AM, Babu Moger wrote:
>
>>> It sounds like we are saying the same thing?
>>> When considering all the sharp corners I agree that keeping kernel_mode_cpus/kernel_mode_cpuslist
>>> seems most user friendly. When doing so there is no need to include CPU assignment in the global
>>> files.
>>
>> Actually, I was talking about removing _per_cpu extension also as the per-CPU requirement is handled inside the group using kernel_mode_cpus/kernel_mode_cpuslist. It can be documented.
>>
>> global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_assign_mon
>> global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_inherit_mon
>
> I see. The goal with this name choice was to distinguish a global mode that
> additionally supports per-CPU assignment from a "true/pure" global mode that
> does not support per-CPU assignment.
>
> If resctrl ever needs to support such "true/pure" global mode that does
> not support per-CPU assignment then resctrl will need to either come up with
> a new mode that does not expose kernel_mode_cpus/kernel_mode_cpuslist or
> make kernel_mode_cpus/kernel_mode_cpuslist read-only. The latter adds the
> complication that user space can always change the mode of a file so resctrl
> would need to add corner cases for that.
>
> To me the "per_cpu" distinction is useful since it make it clear to user space
> that even though this is a "global" configuration it additionally supports
> per-CPU assignment for which user space can expect kernel_mode_cpus/kernel_mode_cpuslist
> to exist and be writable. To me this makes the interface clear and intuitive.
ok. Sure.
>
>>>>
>>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>>>
>>>> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
>>>>
>>>>
>>>> System boots up with following options if PLZA is supported.
>>>>
>>>> # cat info/kernel_mode
>>>> global_assign_ctrl_assign_mon_per_cpu
>>>> global_assign_ctrl_inherit_mon_per_cpu
>>>>
>>>> No groups are associated with kernel mode at this point.
>>>
>>> To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
>>> above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
>>> we will need to add documentation that describes to user space how this file should be interpreted.
>>> It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
>>>
>>> I think something like below will be more intuitive and not need much additional
>>> documentation to understand (I am just adding the "uninitialized" as an example to match text
>>> printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
>>> the lack of "/" could be used to make it clear what this means?):
>>>
>>> # cat info/kernel_mode
>>> [inherit_ctrl_and_mon]
>>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>>> global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
>>>
>>
>> Sounds ok to me.
>>
>>
>>> I also think an interface like this would be simpler for user space to use as it (user space) switches
>>> between PLZA capable and non-PLZA capable systems since user space need not associate existence of
>>> the file with some kernel mode state in addition to actual content of the file when it does exist.
>>>
>>> I assumed that info/kernel_mode can just always be made visible and not depend on PLZA
>>> capable hardware. This means that on Intel and Arm this file can show:
>>>
>>> # cat info/kernel_mode
>>> [inherit_ctrl_and_mon]
>>>
>>
>> Yes. Sure.
>>
>>
>>> For Intel this is accurate and also for Arm if I interpret the Arm implementation correctly
>>> (see mpam_thread_switch()) in https://lore.kernel.org/lkml/20260313144617.3420416-7-ben.horgan@arm.com/
>>>
>>>>
>>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>>>>
>>>> # cat info/kernel_mode
>>>> global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>>> global_assign_ctrl_inherit_mon_per_cpu
>>>>
>>>>
>>>> # echo "global_assign_ctrl_inherit_mon_per_cpu:group=//" > info/kernel_mode
>>>>
>>>>
>>>> # cat info/kernel_mode
>>>> global_assign_ctrl_assign_mon_per_cpu
>>>> global_assign_ctrl_inherit_mon_per_cpu:group=//
>>>>
>>>>
>>>> How does this look?
>>>
>>> In addition to above I think it will be helpful to add a clear indication to user
>>> space on what the current active mode is, for example, via the [] characters.
>>
>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>>
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon
>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>> [global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
>>
>> Something like this?
>
> How about making it clear that the whole line/configuration is active, like below:
>
> # cat info/kernel_mode
> inherit_ctrl_and_mon
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>
>
ok. Sure.
>>
>> There is one problem here. The mode "inherit_ctrl_and_mon" listing not consistent with others.
>
> It is difficult to predict what resctrl will be asked to support next. One possibility here is
> to make it part of the original design that the first field is the "mode" and the following field
> contains that mode's global properties of which there could be more than one. Above shows that
> the two "global" modes have a single global property but we could just try to be safe with some
> documentation that states there could be more.
>
> Consider for example some hypothetical future where the file looks like:
>
> # cat info/kernel_mode
> inherit_ctrl_and_mon:some_unique_capability=true
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized;other_property=val
> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>
> To leave room for growth the file could start out by, for example, appending ":"
> to "inherit_ctrl_and_mon" to indicate that there are no known properties yet? Something like
> below. Would this be more consistent with the others?
To me, it might be clearer to simply document what the default mode is
when kernel mode is not enabled, and omit "inherit_ctrl_and_mon" from
the display.
That said, I’m fine with either approach.
Thanks
Babu
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 17:35 UTC (permalink / raw)
To: Babu Moger, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <a46f4f2d-e3f1-454f-b94b-c54e14e45a69@amd.com>
Hi Babu,
On 4/21/26 9:46 AM, Babu Moger wrote:
> On 4/21/26 11:15, Reinette Chatre wrote:
>> On 4/21/26 8:08 AM, Babu Moger wrote:
>> It sounds like we are saying the same thing?
>> When considering all the sharp corners I agree that keeping kernel_mode_cpus/kernel_mode_cpuslist
>> seems most user friendly. When doing so there is no need to include CPU assignment in the global
>> files.
>
> Actually, I was talking about removing _per_cpu extension also as the per-CPU requirement is handled inside the group using kernel_mode_cpus/kernel_mode_cpuslist. It can be documented.
>
> global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_assign_mon
> global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_inherit_mon
I see. The goal with this name choice was to distinguish a global mode that
additionally supports per-CPU assignment from a "true/pure" global mode that
does not support per-CPU assignment.
If resctrl ever needs to support such "true/pure" global mode that does
not support per-CPU assignment then resctrl will need to either come up with
a new mode that does not expose kernel_mode_cpus/kernel_mode_cpuslist or
make kernel_mode_cpus/kernel_mode_cpuslist read-only. The latter adds the
complication that user space can always change the mode of a file so resctrl
would need to add corner cases for that.
To me the "per_cpu" distinction is useful since it make it clear to user space
that even though this is a "global" configuration it additionally supports
per-CPU assignment for which user space can expect kernel_mode_cpus/kernel_mode_cpuslist
to exist and be writable. To me this makes the interface clear and intuitive.
>>>
>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>>
>>> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
>>>
>>>
>>> System boots up with following options if PLZA is supported.
>>>
>>> # cat info/kernel_mode
>>> global_assign_ctrl_assign_mon_per_cpu
>>> global_assign_ctrl_inherit_mon_per_cpu
>>>
>>> No groups are associated with kernel mode at this point.
>>
>> To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
>> above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
>> we will need to add documentation that describes to user space how this file should be interpreted.
>> It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
>>
>> I think something like below will be more intuitive and not need much additional
>> documentation to understand (I am just adding the "uninitialized" as an example to match text
>> printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
>> the lack of "/" could be used to make it clear what this means?):
>>
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon]
>> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
>> global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
>>
>
> Sounds ok to me.
>
>
>> I also think an interface like this would be simpler for user space to use as it (user space) switches
>> between PLZA capable and non-PLZA capable systems since user space need not associate existence of
>> the file with some kernel mode state in addition to actual content of the file when it does exist.
>>
>> I assumed that info/kernel_mode can just always be made visible and not depend on PLZA
>> capable hardware. This means that on Intel and Arm this file can show:
>>
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon]
>>
>
> Yes. Sure.
>
>
>> For Intel this is accurate and also for Arm if I interpret the Arm implementation correctly
>> (see mpam_thread_switch()) in https://lore.kernel.org/lkml/20260313144617.3420416-7-ben.horgan@arm.com/
>>
>>>
>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>>>
>>> # cat info/kernel_mode
>>> global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>> global_assign_ctrl_inherit_mon_per_cpu
>>>
>>>
>>> # echo "global_assign_ctrl_inherit_mon_per_cpu:group=//" > info/kernel_mode
>>>
>>>
>>> # cat info/kernel_mode
>>> global_assign_ctrl_assign_mon_per_cpu
>>> global_assign_ctrl_inherit_mon_per_cpu:group=//
>>>
>>>
>>> How does this look?
>>
>> In addition to above I think it will be helpful to add a clear indication to user
>> space on what the current active mode is, for example, via the [] characters.
>
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>
> # cat info/kernel_mode
> inherit_ctrl_and_mon
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
> [global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
>
> Something like this?
How about making it clear that the whole line/configuration is active, like below:
# cat info/kernel_mode
inherit_ctrl_and_mon
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>
> There is one problem here. The mode "inherit_ctrl_and_mon" listing not consistent with others.
It is difficult to predict what resctrl will be asked to support next. One possibility here is
to make it part of the original design that the first field is the "mode" and the following field
contains that mode's global properties of which there could be more than one. Above shows that
the two "global" modes have a single global property but we could just try to be safe with some
documentation that states there could be more.
Consider for example some hypothetical future where the file looks like:
# cat info/kernel_mode
inherit_ctrl_and_mon:some_unique_capability=true
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized;other_property=val
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
To leave room for growth the file could start out by, for example, appending ":"
to "inherit_ctrl_and_mon" to indicate that there are no known properties yet? Something like
below. Would this be more consistent with the others?
# cat info/kernel_mode
inherit_ctrl_and_mon:
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Babu Moger @ 2026-04-21 16:46 UTC (permalink / raw)
To: Reinette Chatre, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <8d969f11-4a7f-4e36-b85a-c3ed714fc603@intel.com>
Hi Reinette,
On 4/21/26 11:15, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/21/26 8:08 AM, Babu Moger wrote:
>> Hi Reinette,
>>
>> On 4/20/26 22:17, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/20/26 5:40 PM, Moger, Babu wrote:
>>>>
>>>> We already discussed moving back to the default group on every mode
>>>> switch. Doing so here would once again cause extra MSR writes on
>>>> each mode transition, which is undesirable.
>>>>
>>>
>>> Needing to avoid extra MSR writes in resctrl is not so absolute. Consider, for
>>> example, how resctrl initializes default allocations when a new resource group is
>>> created. resctrl aims to initialize with sane defaults and the user is expected to
>>> follow with desired allocations.
>>>
>>> I am not against optimizing, I just want to be careful with such general statements.
>>>
>>> Considering your proposal in https://lore.kernel.org/lkml/39e0c786-cc35-4555-bfb9-ff7cd758c423@amd.com/:
>>>
>>> I do not think we should make info/kernel_mode read-only. If I understand correctly
>>> doing so would accommodate AMD PLZA but it ignores the discussions on how resctrl could
>>> support MPAM ... or do you perhaps have proposal on how MPAM can be supported when considering
>>> your proposal? Even if you do not want to consider MPAM - what if the PLZA_PQR register's
>>> scope becomes per-CPU in the next version of AMD PLZA?
>>>
>>> The idea behind info/kernel_mode is that the active mode it identifies indicates which
>>> configuration files exist to configure the active mode. Since the mode may not always
>>> depend on global configuration, for which info/kernel_mode_assignment was created, but instead
>>> rely on per-resource group files, I do not see how resctrl can build on a read-only
>>> info/kernel_mode backed by a mode and group change via info/kernel_mode_assignment.
>>> Specifically, MPAM support may not use info/kernel_mode_assignment at all.
>>> Instead, MPAM may use something like described in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@e134344.arm.com/
>>>
>>> Could we perhaps consider dropping info/kernel_mode_assignment entirely for
>>> AMD PLZA's global allocations? Similar to what you suggest, the mode and
>>> group assignment could be done via the info/kernel_mode file instead?
>>>
>>> Thinking about this more since the CPUs allocation is global, these could *theoretically*
>>> be included also (but see later).
>>> This could mean that "kernel_mode_cpus" and "kernel_mode_cpus_list" could be dropped?
>>> Although, this may complicate the interface since user space may want a convenient way
>>> to modify just CPUs independently from needing to repeat the mode and group every time.
>>>
>>> Consider, for example:
>>>
>>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8" > info/kernel_mode
>>
>> This looks reasonable.
>>
>>>
>>> Having named fields (a) makes this extensible, (b) output does not need to be split among files,
>>> and (c) "inherit_ctrl_and_mon" can continue to be supported.
>>>
>>> The named fields could be made optional, if group is omitted then it will become the
>>> default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
>>> This may not be intuitive since a user may expect that not mentioning a field means
>>> that the field is left untouched. Have you considered this scenario in your proposal?
>>>
>>> As an alternative the group could be made a required field and "kernel_mode_cpus"/"kernel_mode_cpuslist"
>>> can stay? This may be the simplest approach.
>>
>> How about keeping a single option to update the CPUs using
>> kernel_mode_cpus / kernel_mode_cpuslist within the group?
>>
>> Should we consider removing the per‑CPU extension altogether? By
>> default, the mode already applies to all online CPUs, and any
>> per‑CPU requirements can be handled within the group using
>> kernel_mode_cpus / kernel_mode_cpuslist.
>
> It sounds like we are saying the same thing?
> When considering all the sharp corners I agree that keeping kernel_mode_cpus/kernel_mode_cpuslist
> seems most user friendly. When doing so there is no need to include CPU assignment in the global
> files.
Actually, I was talking about removing _per_cpu extension also as the
per-CPU requirement is handled inside the group using
kernel_mode_cpus/kernel_mode_cpuslist. It can be documented.
global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_assign_mon
global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_inherit_mon
>>
>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>>
>> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
>>
>>
>> System boots up with following options if PLZA is supported.
>>
>> # cat info/kernel_mode
>> global_assign_ctrl_assign_mon_per_cpu
>> global_assign_ctrl_inherit_mon_per_cpu
>>
>> No groups are associated with kernel mode at this point.
>
> To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
> above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
> we will need to add documentation that describes to user space how this file should be interpreted.
> It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
>
> I think something like below will be more intuitive and not need much additional
> documentation to understand (I am just adding the "uninitialized" as an example to match text
> printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
> the lack of "/" could be used to make it clear what this means?):
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
> global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
> global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
>
Sounds ok to me.
> I also think an interface like this would be simpler for user space to use as it (user space) switches
> between PLZA capable and non-PLZA capable systems since user space need not associate existence of
> the file with some kernel mode state in addition to actual content of the file when it does exist.
>
> I assumed that info/kernel_mode can just always be made visible and not depend on PLZA
> capable hardware. This means that on Intel and Arm this file can show:
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
>
Yes. Sure.
> For Intel this is accurate and also for Arm if I interpret the Arm implementation correctly
> (see mpam_thread_switch()) in https://lore.kernel.org/lkml/20260313144617.3420416-7-ben.horgan@arm.com/
>
>>
>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>>
>> # cat info/kernel_mode
>> global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>> global_assign_ctrl_inherit_mon_per_cpu
>>
>>
>> # echo "global_assign_ctrl_inherit_mon_per_cpu:group=//" > info/kernel_mode
>>
>>
>> # cat info/kernel_mode
>> global_assign_ctrl_assign_mon_per_cpu
>> global_assign_ctrl_inherit_mon_per_cpu:group=//
>>
>>
>> How does this look?
>
> In addition to above I think it will be helpful to add a clear indication to user
> space on what the current active mode is, for example, via the [] characters.
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" >
info/kernel_mode
# cat info/kernel_mode
inherit_ctrl_and_mon
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
[global_assign_ctrl_assign_mon_per_cpu]:group=ctrl1/mon1/
Something like this?
There is one problem here. The mode "inherit_ctrl_and_mon" listing not
consistent with others.
Thanks
Babu
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 16:15 UTC (permalink / raw)
To: Babu Moger, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <e624f652-f0a6-4926-a0ab-c4486d41eb6d@amd.com>
Hi Babu,
On 4/21/26 8:08 AM, Babu Moger wrote:
> Hi Reinette,
>
> On 4/20/26 22:17, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/20/26 5:40 PM, Moger, Babu wrote:
>>>
>>> We already discussed moving back to the default group on every mode
>>> switch. Doing so here would once again cause extra MSR writes on
>>> each mode transition, which is undesirable.
>>>
>>
>> Needing to avoid extra MSR writes in resctrl is not so absolute. Consider, for
>> example, how resctrl initializes default allocations when a new resource group is
>> created. resctrl aims to initialize with sane defaults and the user is expected to
>> follow with desired allocations.
>>
>> I am not against optimizing, I just want to be careful with such general statements.
>>
>> Considering your proposal in https://lore.kernel.org/lkml/39e0c786-cc35-4555-bfb9-ff7cd758c423@amd.com/:
>>
>> I do not think we should make info/kernel_mode read-only. If I understand correctly
>> doing so would accommodate AMD PLZA but it ignores the discussions on how resctrl could
>> support MPAM ... or do you perhaps have proposal on how MPAM can be supported when considering
>> your proposal? Even if you do not want to consider MPAM - what if the PLZA_PQR register's
>> scope becomes per-CPU in the next version of AMD PLZA?
>>
>> The idea behind info/kernel_mode is that the active mode it identifies indicates which
>> configuration files exist to configure the active mode. Since the mode may not always
>> depend on global configuration, for which info/kernel_mode_assignment was created, but instead
>> rely on per-resource group files, I do not see how resctrl can build on a read-only
>> info/kernel_mode backed by a mode and group change via info/kernel_mode_assignment.
>> Specifically, MPAM support may not use info/kernel_mode_assignment at all.
>> Instead, MPAM may use something like described in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@e134344.arm.com/
>>
>> Could we perhaps consider dropping info/kernel_mode_assignment entirely for
>> AMD PLZA's global allocations? Similar to what you suggest, the mode and
>> group assignment could be done via the info/kernel_mode file instead?
>>
>> Thinking about this more since the CPUs allocation is global, these could *theoretically*
>> be included also (but see later).
>> This could mean that "kernel_mode_cpus" and "kernel_mode_cpus_list" could be dropped?
>> Although, this may complicate the interface since user space may want a convenient way
>> to modify just CPUs independently from needing to repeat the mode and group every time.
>>
>> Consider, for example:
>>
>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8" > info/kernel_mode
>
> This looks reasonable.
>
>>
>> Having named fields (a) makes this extensible, (b) output does not need to be split among files,
>> and (c) "inherit_ctrl_and_mon" can continue to be supported.
>>
>> The named fields could be made optional, if group is omitted then it will become the
>> default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
>> This may not be intuitive since a user may expect that not mentioning a field means
>> that the field is left untouched. Have you considered this scenario in your proposal?
>>
>> As an alternative the group could be made a required field and "kernel_mode_cpus"/"kernel_mode_cpuslist"
>> can stay? This may be the simplest approach.
>
> How about keeping a single option to update the CPUs using
> kernel_mode_cpus / kernel_mode_cpuslist within the group?
>
> Should we consider removing the per‑CPU extension altogether? By
> default, the mode already applies to all online CPUs, and any
> per‑CPU requirements can be handled within the group using
> kernel_mode_cpus / kernel_mode_cpuslist.
It sounds like we are saying the same thing?
When considering all the sharp corners I agree that keeping kernel_mode_cpus/kernel_mode_cpuslist
seems most user friendly. When doing so there is no need to include CPU assignment in the global
files.
>
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
>
> Why do we still need to keep the "inherit_ctrl_and_mon"? By default all the groups in the system falls in this category it is not plza enabled group.
>
>
> System boots up with following options if PLZA is supported.
>
> # cat info/kernel_mode
> global_assign_ctrl_assign_mon_per_cpu
> global_assign_ctrl_inherit_mon_per_cpu
>
> No groups are associated with kernel mode at this point.
To me it seems useful to be clear to user space on what the current mode is. If I understand correctly
above default scenario essentially means "inherit_ctrl_and_mon" but instead of adding it to this file
we will need to add documentation that describes to user space how this file should be interpreted.
It seems easier to me to just be clear via info/kernel_mode itself on what the current active mode is?
I think something like below will be more intuitive and not need much additional
documentation to understand (I am just adding the "uninitialized" as an example to match text
printed in schemata file during pseudo-locking ... even if there is a group named "uninitialized"
the lack of "/" could be used to make it clear what this means?):
# cat info/kernel_mode
[inherit_ctrl_and_mon]
global_assign_ctrl_assign_mon_per_cpu:group=uninitialized
global_assign_ctrl_inherit_mon_per_cpu:group=uninitialized
I also think an interface like this would be simpler for user space to use as it (user space) switches
between PLZA capable and non-PLZA capable systems since user space need not associate existence of
the file with some kernel mode state in addition to actual content of the file when it does exist.
I assumed that info/kernel_mode can just always be made visible and not depend on PLZA
capable hardware. This means that on Intel and Arm this file can show:
# cat info/kernel_mode
[inherit_ctrl_and_mon]
For Intel this is accurate and also for Arm if I interpret the Arm implementation correctly
(see mpam_thread_switch()) in https://lore.kernel.org/lkml/20260313144617.3420416-7-ben.horgan@arm.com/
>
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" > info/kernel_mode
>
> # cat info/kernel_mode
> global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
> global_assign_ctrl_inherit_mon_per_cpu
>
>
> # echo "global_assign_ctrl_inherit_mon_per_cpu:group=//" > info/kernel_mode
>
>
> # cat info/kernel_mode
> global_assign_ctrl_assign_mon_per_cpu
> global_assign_ctrl_inherit_mon_per_cpu:group=//
>
>
> How does this look?
In addition to above I think it will be helpful to add a clear indication to user
space on what the current active mode is, for example, via the [] characters.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 16:12 UTC (permalink / raw)
To: Luck, Tony
Cc: Moger, Babu, Babu Moger, corbet@lwn.net, Dave.Martin@arm.com,
james.morse@arm.com, tglx@kernel.org, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com,
skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <aeeTnL3aisKgPJG-@agluck-desk3>
Hi Tony,
On 4/21/26 8:11 AM, Luck, Tony wrote:
> On Mon, Apr 20, 2026 at 05:21:50PM -0700, Reinette Chatre wrote:
>> On 4/20/26 5:03 PM, Luck, Tony wrote:
...
>>>> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
>>>
>>> This mode needs a CLOSID for PLZA, but doesn't need an RMID.
>>>
>>>> At this stage, only the kernel mode is being changed. However, there is no
>>>> way to know which control group the user intends to assign to kernel mode.
>>>> All we know here is the selected mode.
>>>>
>>>> After this operation, the info/kernel_mode_assignment interface should
>>>> become visible. But the question is: what should it contain or point to at
>>>> this moment?
>>>>
>>>> # cat info/kernel_mode_assignment
>>>> ??
>>>>
>>>> Next operation: Assign the group
>>>>
>>>> # echo "ctrl1//" > info/kernel_mode_assignment
>>>
>>> Now ring0 code is using the CLOSID from the ctrl1 group.
>>
>> ... and user space tasks also continue to use the CLOSID from the
>> ctrl1 group.
>> It is up to user space to decide if a group is dedicated to kernel
>> mode or not. resctrl does not enforce it.
>>
>>>
>>> But the RMID for this group isn't used.
>>
>> RMID is still used by user mode that maintains existing behavior concerning
>> this group when considering its tasks/cpus/cpus_list files. RMID assigned to this
>> group is just not used for kernel mode.
>
> True, that the RMID is used if the user makes assignments using tasks/cpus/cpus_list
> for the ctrl1 group. But they might not do that.
>
>>
>>>
>>> Are we OK with "wasting" an RMID in this way?
>>
>> How do you see this RMID as "wasted"?
>
> Suppose the user doesn't assign tasks to the ctrl1 group?
>
> Perhaps the resources they want to make available to the kernel do
> not exactly match with resources that they want to provide to any
> tasks. In this case the RMID is wasted.
Under these circumstances, yes, the RMID will not be used.
A related scenario (when considering what may happen if user does not assign tasks to
the ctrl1 group) is when user space disables PLZA on all CPUs in a domain then the CLOSID
(as well as RMID since this is irrespective of rmid_en mode) associated with kernel_mode
will be unused in that domain.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Luck, Tony @ 2026-04-21 15:11 UTC (permalink / raw)
To: Reinette Chatre
Cc: Moger, Babu, Babu Moger, corbet@lwn.net, Dave.Martin@arm.com,
james.morse@arm.com, tglx@kernel.org, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com,
skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <58b8fe0c-80f6-4ba2-abbe-90d0ceee6daa@intel.com>
On Mon, Apr 20, 2026 at 05:21:50PM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 4/20/26 5:03 PM, Luck, Tony wrote:
> >> The system boots with these default settings:
> >>
> >> # cat info/kernel_mode
> >> [inherit_ctrl_and_mon]
> >> global_assign_ctrl_assign_mon_per_cpu
> >> global_assign_ctrl_inherit_mon_per_cpu
> >>
> >>
> >> At this point, the interface info/kernel_mode_assignment is not visible.
> >>
> >> Next, lets create a new control group:
> >>
> >> # mkdir ctrl1
> >
> > This allocates a CLOSID and an RMID for this group.
> >
> >> We want to designate this group as the new kernel-mode group.
> >>
> >> First operation: Change the mode:
> >>
> >> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
> >
> > This mode needs a CLOSID for PLZA, but doesn't need an RMID.
> >
> >> At this stage, only the kernel mode is being changed. However, there is no
> >> way to know which control group the user intends to assign to kernel mode.
> >> All we know here is the selected mode.
> >>
> >> After this operation, the info/kernel_mode_assignment interface should
> >> become visible. But the question is: what should it contain or point to at
> >> this moment?
> >>
> >> # cat info/kernel_mode_assignment
> >> ??
> >>
> >> Next operation: Assign the group
> >>
> >> # echo "ctrl1//" > info/kernel_mode_assignment
> >
> > Now ring0 code is using the CLOSID from the ctrl1 group.
>
> ... and user space tasks also continue to use the CLOSID from the
> ctrl1 group.
> It is up to user space to decide if a group is dedicated to kernel
> mode or not. resctrl does not enforce it.
>
> >
> > But the RMID for this group isn't used.
>
> RMID is still used by user mode that maintains existing behavior concerning
> this group when considering its tasks/cpus/cpus_list files. RMID assigned to this
> group is just not used for kernel mode.
True, that the RMID is used if the user makes assignments using tasks/cpus/cpus_list
for the ctrl1 group. But they might not do that.
>
> >
> > Are we OK with "wasting" an RMID in this way?
>
> How do you see this RMID as "wasted"?
Suppose the user doesn't assign tasks to the ctrl1 group?
Perhaps the resources they want to make available to the kernel do
not exactly match with resources that they want to provide to any
tasks. In this case the RMID is wasted.
> >
> > Maybe it doesn't matter too much for AMD as you would just
> > avoid assigning any counters to this group. But should Intel
> > get around to doing PLZA-like functionality, that's a real
> > loss of an RMID that might be useful elsewhere.
>
> Reinette
-Tony
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Babu Moger @ 2026-04-21 15:08 UTC (permalink / raw)
To: Reinette Chatre, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <71099958-1ddf-40dc-8a3c-aa13d0c56fee@intel.com>
Hi Reinette,
On 4/20/26 22:17, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/20/26 5:40 PM, Moger, Babu wrote:
>>
>> We already discussed moving back to the default group on every mode
>> switch. Doing so here would once again cause extra MSR writes on
>> each mode transition, which is undesirable.
>>
>
> Needing to avoid extra MSR writes in resctrl is not so absolute. Consider, for
> example, how resctrl initializes default allocations when a new resource group is
> created. resctrl aims to initialize with sane defaults and the user is expected to
> follow with desired allocations.
>
> I am not against optimizing, I just want to be careful with such general statements.
>
> Considering your proposal in https://lore.kernel.org/lkml/39e0c786-cc35-4555-bfb9-ff7cd758c423@amd.com/:
>
> I do not think we should make info/kernel_mode read-only. If I understand correctly
> doing so would accommodate AMD PLZA but it ignores the discussions on how resctrl could
> support MPAM ... or do you perhaps have proposal on how MPAM can be supported when considering
> your proposal? Even if you do not want to consider MPAM - what if the PLZA_PQR register's
> scope becomes per-CPU in the next version of AMD PLZA?
>
> The idea behind info/kernel_mode is that the active mode it identifies indicates which
> configuration files exist to configure the active mode. Since the mode may not always
> depend on global configuration, for which info/kernel_mode_assignment was created, but instead
> rely on per-resource group files, I do not see how resctrl can build on a read-only
> info/kernel_mode backed by a mode and group change via info/kernel_mode_assignment.
> Specifically, MPAM support may not use info/kernel_mode_assignment at all.
> Instead, MPAM may use something like described in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@e134344.arm.com/
>
> Could we perhaps consider dropping info/kernel_mode_assignment entirely for
> AMD PLZA's global allocations? Similar to what you suggest, the mode and
> group assignment could be done via the info/kernel_mode file instead?
>
> Thinking about this more since the CPUs allocation is global, these could *theoretically*
> be included also (but see later).
> This could mean that "kernel_mode_cpus" and "kernel_mode_cpus_list" could be dropped?
> Although, this may complicate the interface since user space may want a convenient way
> to modify just CPUs independently from needing to repeat the mode and group every time.
>
> Consider, for example:
>
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8" > info/kernel_mode
This looks reasonable.
>
> Having named fields (a) makes this extensible, (b) output does not need to be split among files,
> and (c) "inherit_ctrl_and_mon" can continue to be supported.
>
> The named fields could be made optional, if group is omitted then it will become the
> default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
> This may not be intuitive since a user may expect that not mentioning a field means
> that the field is left untouched. Have you considered this scenario in your proposal?
>
> As an alternative the group could be made a required field and "kernel_mode_cpus"/"kernel_mode_cpuslist"
> can stay? This may be the simplest approach.
How about keeping a single option to update the CPUs using
kernel_mode_cpus / kernel_mode_cpuslist within the group?
Should we consider removing the per‑CPU extension altogether? By
default, the mode already applies to all online CPUs, and any per‑CPU
requirements can be handled within the group using kernel_mode_cpus /
kernel_mode_cpuslist.
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
Why do we still need to keep the "inherit_ctrl_and_mon"? By default all
the groups in the system falls in this category it is not plza enabled
group.
System boots up with following options if PLZA is supported.
# cat info/kernel_mode
global_assign_ctrl_assign_mon_per_cpu
global_assign_ctrl_inherit_mon_per_cpu
No groups are associated with kernel mode at this point.
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" >
info/kernel_mode
# cat info/kernel_mode
global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/
global_assign_ctrl_inherit_mon_per_cpu
# echo "global_assign_ctrl_inherit_mon_per_cpu:group=//" > info/kernel_mode
# cat info/kernel_mode
global_assign_ctrl_assign_mon_per_cpu
global_assign_ctrl_inherit_mon_per_cpu:group=//
How does this look?
Thanks
Babu
^ permalink raw reply
* [PATCH v13 00/48] arm64: Support for Arm CCA in KVM
From: Jiahao zheng @ 2026-04-21 13:51 UTC (permalink / raw)
To: steven.price
Cc: alexandru.elisei, alpergun, aneesh.kumar, catalin.marinas,
christoffer.dall, fj0570is, gankulkarni, gshan, james.morse,
joey.gouly, kvm, kvmarm, linux-arm-kernel, linux-coco,
linux-kernel, maz, oliver.upton, sdonthineni, suzuki.poulose,
tabba, vannapurve, will, yuzenghui
In-Reply-To: <20260318155413.793430-1-steven.price@arm.com>
Hi Steven,
I've been testing CCA patch series and noticed Realm VM cannot boot successfully when the host is forced to run in nVHE mode (e.g., via `kvm-arm.mode=nvhe`). The kvmtool debug information will be truncated in set_guest_bank_private_gpa.
Currently, in `kvm_ioctl_vcpu_run()`, running a Realm VM (REC) bypasses the standard nVHE EL2 stub. `kvm_rec_enter()` directly executes the SMC instruction to transition to the RMM. Upon returning to the EL1 host, the code falls back to `kvm_vgic_sync_hwstate()`, where the VGIC save operation is explicitly skipped for nVHE. Since the EL2 stub was bypassed, `__vgic_v3_save_state()` is never executed, and `ICH_*_EL2` states are lost.
To resolve this, I have a couple of thoughts:
1. If Host nVHE mode is not intended to be supported for Realms:
Since RME implies ARMv9 which mandates VHE, running a Realm with an nVHE host might just be an unsupported edge case. If so, we should explicitly reject RME initialization or REC creation when `!is_kernel_in_hyp_mode()`. This would cleanly prevent the undefined behavior.
2. If Host nVHE mode is intended to be supported:
Since RMM should remain agnostic to the Non-Secure VGIC states, the burden of saving these states falls strictly on KVM. However, the EL1 host cannot access `ICH_*_EL2`. Therefore, KVM needs to add specific logic for this scenario. We would likely need to route the REC exit through a dedicated nVHE EL2 stub to invoke `__vgic_v3_save_state()` before dropping back to EL1, rather than jumping straight back to `kvm_ioctl_vcpu_run()`.
I might have missed some documentation or comments regarding nVHE restrictions for CCA. If this is an oversight, it would be great to see a check added in the next iteration of the series.
Thanks,
Zheng
^ permalink raw reply
* Re: [PATCH v5 1/2] dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory
From: Jason Gunthorpe @ 2026-04-21 12:10 UTC (permalink / raw)
To: Jiri Pirko
Cc: Aneesh Kumar K.V, dri-devel, linaro-mm-sig, iommu, linux-media,
sumit.semwal, benjamin.gaignard, Brian.Starkey, jstultz,
tjmercier, christian.koenig, m.szyprowski, robin.murphy, leon,
sean.anderson, ptesarik, catalin.marinas, suzuki.poulose,
steven.price, thomas.lendacky, john.allen, ashish.kalra,
suravee.suthikulpanit, linux-coco
In-Reply-To: <tteiecxfqy4k24wnzvp6ocxnuopyhmqtne2xwh5htwldlbzjnp@o6cbzdlurxld>
On Tue, Apr 21, 2026 at 01:53:31PM +0200, Jiri Pirko wrote:
> >> You reach there when is_swiotlb_force_bounce(dev) is true and
> >> DMA_ATTR_CC_SHARED is set. What am I missing?
> >>
> >
> >So a swiotlb_force_bounce will not use swiotlb bouncing if
> >DMA_ATTR_CC_SHARED is set ?
>
> Correct. Bouncing does not make sense in this case, as shared memory is
> already being mapped.
It is a little bit mangled, there are many reasons force_swiotlb can
be set, but we loose them as it flows through - swiotlb_init()
just has a simple SWIOTLB_FORCE
Ideally DMA_ATTR_CC_SHARED would skip swiotlb only if it is being
selected for CC reasons. For instance if you have the swiotlb force
command line parameter I would still expect it bounce shared memory.
Arguably I think this arch flow is misdesigned, the
is_swiotlb_force_bounce() should not be used for CC. dma_capable() is
the correct API to check if the device can DMA to the presented
address, and it will trigger swiotlb_map() just the same without
creating this gap.
Jason
^ permalink raw reply
* Re: [PATCH v5 1/2] dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory
From: Jiri Pirko @ 2026-04-21 11:53 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: dri-devel, linaro-mm-sig, iommu, linux-media, sumit.semwal,
benjamin.gaignard, Brian.Starkey, jstultz, tjmercier,
christian.koenig, m.szyprowski, robin.murphy, jgg, leon,
sean.anderson, ptesarik, catalin.marinas, suzuki.poulose,
steven.price, thomas.lendacky, john.allen, ashish.kalra,
suravee.suthikulpanit, linux-coco
In-Reply-To: <yq5awly0d504.fsf@kernel.org>
Tue, Apr 21, 2026 at 11:42:03AM +0200, aneesh.kumar@kernel.org wrote:
>Jiri Pirko <jiri@resnulli.us> writes:
>
>> Mon, Apr 20, 2026 at 08:34:06AM +0200, aneesh.kumar@kernel.org wrote:
>>>Jiri Pirko <jiri@resnulli.us> writes:
>>>
>>>> From: Jiri Pirko <jiri@nvidia.com>
>>>>
>>>> Current CC designs don't place a vIOMMU in front of untrusted devices.
>>>> Instead, the DMA API forces all untrusted device DMA through swiotlb
>>>> bounce buffers (is_swiotlb_force_bounce()) which copies data into
>>>> shared memory on behalf of the device.
>>>>
>>>> When a caller has already arranged for the memory to be shared
>>>> via set_memory_decrypted(), the DMA API needs to know so it can map
>>>> directly using the unencrypted physical address rather than bounce
>>>> buffering. Following the pattern of DMA_ATTR_MMIO, add
>>>> DMA_ATTR_CC_SHARED for this purpose. Like the MMIO case, only the
>>>> caller knows what kind of memory it has and must inform the DMA API
>>>> for it to work correctly.
>>>>
>>>> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
>>>> ---
>>>> v4->v5:
>>>> - rebased on top od dma-mapping-for-next
>>>> - s/decrypted/shared/
>>>> v3->v4:
>>>> - added some sanity checks to dma_map_phys and dma_unmap_phys
>>>> - enhanced documentation of DMA_ATTR_CC_DECRYPTED attr
>>>> v1->v2:
>>>> - rebased on top of recent dma-mapping-fixes
>>>> ---
>>>> include/linux/dma-mapping.h | 10 ++++++++++
>>>> include/trace/events/dma.h | 3 ++-
>>>> kernel/dma/direct.h | 14 +++++++++++---
>>>> kernel/dma/mapping.c | 13 +++++++++++--
>>>> 4 files changed, 34 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
>>>> index 677c51ab7510..db8ab24a54f4 100644
>>>> --- a/include/linux/dma-mapping.h
>>>> +++ b/include/linux/dma-mapping.h
>>>> @@ -92,6 +92,16 @@
>>>> * flushing.
>>>> */
>>>> #define DMA_ATTR_REQUIRE_COHERENT (1UL << 12)
>>>> +/*
>>>> + * DMA_ATTR_CC_SHARED: Indicates the DMA mapping is shared (decrypted) for
>>>> + * confidential computing guests. For normal system memory the caller must have
>>>> + * called set_memory_decrypted(), and pgprot_decrypted must be used when
>>>> + * creating CPU PTEs for the mapping. The same shared semantic may be passed
>>>> + * to the vIOMMU when it sets up the IOPTE. For MMIO use together with
>>>> + * DMA_ATTR_MMIO to indicate shared MMIO. Unless DMA_ATTR_MMIO is provided
>>>> + * a struct page is required.
>>>> + */
>>>> +#define DMA_ATTR_CC_SHARED (1UL << 13)
>>>>
>>>> /*
>>>> * A dma_addr_t can hold any valid DMA or bus address for the platform. It can
>>>> diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
>>>> index 63597b004424..31c9ddf72c9d 100644
>>>> --- a/include/trace/events/dma.h
>>>> +++ b/include/trace/events/dma.h
>>>> @@ -34,7 +34,8 @@ TRACE_DEFINE_ENUM(DMA_NONE);
>>>> { DMA_ATTR_PRIVILEGED, "PRIVILEGED" }, \
>>>> { DMA_ATTR_MMIO, "MMIO" }, \
>>>> { DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, "CACHELINES_OVERLAP" }, \
>>>> - { DMA_ATTR_REQUIRE_COHERENT, "REQUIRE_COHERENT" })
>>>> + { DMA_ATTR_REQUIRE_COHERENT, "REQUIRE_COHERENT" }, \
>>>> + { DMA_ATTR_CC_SHARED, "CC_SHARED" })
>>>>
>>>> DECLARE_EVENT_CLASS(dma_map,
>>>> TP_PROTO(struct device *dev, phys_addr_t phys_addr, dma_addr_t dma_addr,
>>>> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
>>>> index b86ff65496fc..7140c208c123 100644
>>>> --- a/kernel/dma/direct.h
>>>> +++ b/kernel/dma/direct.h
>>>> @@ -89,16 +89,24 @@ static inline dma_addr_t dma_direct_map_phys(struct device *dev,
>>>> dma_addr_t dma_addr;
>>>>
>>>> if (is_swiotlb_force_bounce(dev)) {
>>>> - if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT))
>>>> - return DMA_MAPPING_ERROR;
>>>> + if (!(attrs & DMA_ATTR_CC_SHARED)) {
>>>> + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT))
>>>> + return DMA_MAPPING_ERROR;
>>>>
>>>> - return swiotlb_map(dev, phys, size, dir, attrs);
>>>> + return swiotlb_map(dev, phys, size, dir, attrs);
>>>> + }
>>>> + } else if (attrs & DMA_ATTR_CC_SHARED) {
>>>> + return DMA_MAPPING_ERROR;
>>>> }
>>>>
>>>
>>>What is this check for? If we are requesting a DMA mapping with
>>>DMA_ATTR_CC_SHARED, shouldn’t it be allowed? If not, how would we reach
>>
>> This is defensive. Only allows to map with DMA_ATTR_CC_SHARED set to
>> dev dev that does not support CC natively. This can be of course lifted,
>> if you have a case.
>>
>>
>>>the conditional below where we convert the physical address to a DMA
>>>address using phys_to_dma_unencrypted()?. Also, how is this supposed to
>>>interact with is_swiotlb_force_bounce()?”
>>
>> You reach there when is_swiotlb_force_bounce(dev) is true and
>> DMA_ATTR_CC_SHARED is set. What am I missing?
>>
>
>So a swiotlb_force_bounce will not use swiotlb bouncing if
>DMA_ATTR_CC_SHARED is set ?
Correct. Bouncing does not make sense in this case, as shared memory is
already being mapped.
>
>>
>>
>>>
>>>>
>>>> if (attrs & DMA_ATTR_MMIO) {
>>>> dma_addr = phys;
>>>> if (unlikely(!dma_capable(dev, dma_addr, size, false)))
>>>> goto err_overflow;
>>>> + } else if (attrs & DMA_ATTR_CC_SHARED) {
>>>> + dma_addr = phys_to_dma_unencrypted(dev, phys);
>>>> + if (unlikely(!dma_capable(dev, dma_addr, size, false)))
>>>> + goto err_overflow;
>>>> } else {
>>>> dma_addr = phys_to_dma(dev, phys);
>>>> if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
>>>
>
>-aneesh
^ permalink raw reply
* Re: [PATCH v5 1/2] dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory
From: Aneesh Kumar K.V @ 2026-04-21 9:42 UTC (permalink / raw)
To: Jiri Pirko
Cc: dri-devel, linaro-mm-sig, iommu, linux-media, sumit.semwal,
benjamin.gaignard, Brian.Starkey, jstultz, tjmercier,
christian.koenig, m.szyprowski, robin.murphy, jgg, leon,
sean.anderson, ptesarik, catalin.marinas, suzuki.poulose,
steven.price, thomas.lendacky, john.allen, ashish.kalra,
suravee.suthikulpanit, linux-coco
In-Reply-To: <4qdizkkoeke3cvkcf35upa7p7ick6s654eqlrizmi7ozkw5eze@tnpk2e34xgwl>
Jiri Pirko <jiri@resnulli.us> writes:
> Mon, Apr 20, 2026 at 08:34:06AM +0200, aneesh.kumar@kernel.org wrote:
>>Jiri Pirko <jiri@resnulli.us> writes:
>>
>>> From: Jiri Pirko <jiri@nvidia.com>
>>>
>>> Current CC designs don't place a vIOMMU in front of untrusted devices.
>>> Instead, the DMA API forces all untrusted device DMA through swiotlb
>>> bounce buffers (is_swiotlb_force_bounce()) which copies data into
>>> shared memory on behalf of the device.
>>>
>>> When a caller has already arranged for the memory to be shared
>>> via set_memory_decrypted(), the DMA API needs to know so it can map
>>> directly using the unencrypted physical address rather than bounce
>>> buffering. Following the pattern of DMA_ATTR_MMIO, add
>>> DMA_ATTR_CC_SHARED for this purpose. Like the MMIO case, only the
>>> caller knows what kind of memory it has and must inform the DMA API
>>> for it to work correctly.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
>>> ---
>>> v4->v5:
>>> - rebased on top od dma-mapping-for-next
>>> - s/decrypted/shared/
>>> v3->v4:
>>> - added some sanity checks to dma_map_phys and dma_unmap_phys
>>> - enhanced documentation of DMA_ATTR_CC_DECRYPTED attr
>>> v1->v2:
>>> - rebased on top of recent dma-mapping-fixes
>>> ---
>>> include/linux/dma-mapping.h | 10 ++++++++++
>>> include/trace/events/dma.h | 3 ++-
>>> kernel/dma/direct.h | 14 +++++++++++---
>>> kernel/dma/mapping.c | 13 +++++++++++--
>>> 4 files changed, 34 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
>>> index 677c51ab7510..db8ab24a54f4 100644
>>> --- a/include/linux/dma-mapping.h
>>> +++ b/include/linux/dma-mapping.h
>>> @@ -92,6 +92,16 @@
>>> * flushing.
>>> */
>>> #define DMA_ATTR_REQUIRE_COHERENT (1UL << 12)
>>> +/*
>>> + * DMA_ATTR_CC_SHARED: Indicates the DMA mapping is shared (decrypted) for
>>> + * confidential computing guests. For normal system memory the caller must have
>>> + * called set_memory_decrypted(), and pgprot_decrypted must be used when
>>> + * creating CPU PTEs for the mapping. The same shared semantic may be passed
>>> + * to the vIOMMU when it sets up the IOPTE. For MMIO use together with
>>> + * DMA_ATTR_MMIO to indicate shared MMIO. Unless DMA_ATTR_MMIO is provided
>>> + * a struct page is required.
>>> + */
>>> +#define DMA_ATTR_CC_SHARED (1UL << 13)
>>>
>>> /*
>>> * A dma_addr_t can hold any valid DMA or bus address for the platform. It can
>>> diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
>>> index 63597b004424..31c9ddf72c9d 100644
>>> --- a/include/trace/events/dma.h
>>> +++ b/include/trace/events/dma.h
>>> @@ -34,7 +34,8 @@ TRACE_DEFINE_ENUM(DMA_NONE);
>>> { DMA_ATTR_PRIVILEGED, "PRIVILEGED" }, \
>>> { DMA_ATTR_MMIO, "MMIO" }, \
>>> { DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, "CACHELINES_OVERLAP" }, \
>>> - { DMA_ATTR_REQUIRE_COHERENT, "REQUIRE_COHERENT" })
>>> + { DMA_ATTR_REQUIRE_COHERENT, "REQUIRE_COHERENT" }, \
>>> + { DMA_ATTR_CC_SHARED, "CC_SHARED" })
>>>
>>> DECLARE_EVENT_CLASS(dma_map,
>>> TP_PROTO(struct device *dev, phys_addr_t phys_addr, dma_addr_t dma_addr,
>>> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
>>> index b86ff65496fc..7140c208c123 100644
>>> --- a/kernel/dma/direct.h
>>> +++ b/kernel/dma/direct.h
>>> @@ -89,16 +89,24 @@ static inline dma_addr_t dma_direct_map_phys(struct device *dev,
>>> dma_addr_t dma_addr;
>>>
>>> if (is_swiotlb_force_bounce(dev)) {
>>> - if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT))
>>> - return DMA_MAPPING_ERROR;
>>> + if (!(attrs & DMA_ATTR_CC_SHARED)) {
>>> + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT))
>>> + return DMA_MAPPING_ERROR;
>>>
>>> - return swiotlb_map(dev, phys, size, dir, attrs);
>>> + return swiotlb_map(dev, phys, size, dir, attrs);
>>> + }
>>> + } else if (attrs & DMA_ATTR_CC_SHARED) {
>>> + return DMA_MAPPING_ERROR;
>>> }
>>>
>>
>>What is this check for? If we are requesting a DMA mapping with
>>DMA_ATTR_CC_SHARED, shouldn’t it be allowed? If not, how would we reach
>
> This is defensive. Only allows to map with DMA_ATTR_CC_SHARED set to
> dev dev that does not support CC natively. This can be of course lifted,
> if you have a case.
>
>
>>the conditional below where we convert the physical address to a DMA
>>address using phys_to_dma_unencrypted()?. Also, how is this supposed to
>>interact with is_swiotlb_force_bounce()?”
>
> You reach there when is_swiotlb_force_bounce(dev) is true and
> DMA_ATTR_CC_SHARED is set. What am I missing?
>
So a swiotlb_force_bounce will not use swiotlb bouncing if
DMA_ATTR_CC_SHARED is set ?
>
>
>>
>>>
>>> if (attrs & DMA_ATTR_MMIO) {
>>> dma_addr = phys;
>>> if (unlikely(!dma_capable(dev, dma_addr, size, false)))
>>> goto err_overflow;
>>> + } else if (attrs & DMA_ATTR_CC_SHARED) {
>>> + dma_addr = phys_to_dma_unencrypted(dev, phys);
>>> + if (unlikely(!dma_capable(dev, dma_addr, size, false)))
>>> + goto err_overflow;
>>> } else {
>>> dma_addr = phys_to_dma(dev, phys);
>>> if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
>>
-aneesh
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 3:17 UTC (permalink / raw)
To: Moger, Babu, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <e8530c71-fde2-4522-8b46-a24efb13b681@amd.com>
Hi Babu,
On 4/20/26 5:40 PM, Moger, Babu wrote:
>
> We already discussed moving back to the default group on every mode
> switch. Doing so here would once again cause extra MSR writes on
> each mode transition, which is undesirable.
>
Needing to avoid extra MSR writes in resctrl is not so absolute. Consider, for
example, how resctrl initializes default allocations when a new resource group is
created. resctrl aims to initialize with sane defaults and the user is expected to
follow with desired allocations.
I am not against optimizing, I just want to be careful with such general statements.
Considering your proposal in https://lore.kernel.org/lkml/39e0c786-cc35-4555-bfb9-ff7cd758c423@amd.com/:
I do not think we should make info/kernel_mode read-only. If I understand correctly
doing so would accommodate AMD PLZA but it ignores the discussions on how resctrl could
support MPAM ... or do you perhaps have proposal on how MPAM can be supported when considering
your proposal? Even if you do not want to consider MPAM - what if the PLZA_PQR register's
scope becomes per-CPU in the next version of AMD PLZA?
The idea behind info/kernel_mode is that the active mode it identifies indicates which
configuration files exist to configure the active mode. Since the mode may not always
depend on global configuration, for which info/kernel_mode_assignment was created, but instead
rely on per-resource group files, I do not see how resctrl can build on a read-only
info/kernel_mode backed by a mode and group change via info/kernel_mode_assignment.
Specifically, MPAM support may not use info/kernel_mode_assignment at all.
Instead, MPAM may use something like described in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@e134344.arm.com/
Could we perhaps consider dropping info/kernel_mode_assignment entirely for
AMD PLZA's global allocations? Similar to what you suggest, the mode and
group assignment could be done via the info/kernel_mode file instead?
Thinking about this more since the CPUs allocation is global, these could *theoretically*
be included also (but see later).
This could mean that "kernel_mode_cpus" and "kernel_mode_cpus_list" could be dropped?
Although, this may complicate the interface since user space may want a convenient way
to modify just CPUs independently from needing to repeat the mode and group every time.
Consider, for example:
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8" > info/kernel_mode
Having named fields (a) makes this extensible, (b) output does not need to be split among files,
and (c) "inherit_ctrl_and_mon" can continue to be supported.
The named fields could be made optional, if group is omitted then it will become the
default resource group, and if cpus/cpus_list is omitted then it will default to all CPUs.
This may not be intuitive since a user may expect that not mentioning a field means
that the field is left untouched. Have you considered this scenario in your proposal?
As an alternative the group could be made a required field and "kernel_mode_cpus"/"kernel_mode_cpuslist"
can stay? This may be the simplest approach.
Output could still use [] to indicate the active mode that includes its properties.
I find to be more intuitive interface where output more closely matches input.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Moger, Babu @ 2026-04-21 0:40 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <741aa53e-461c-4a1a-a701-6060d42012f8@intel.com>
Hi Reinette,
On 4/20/2026 6:34 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/20/26 3:59 PM, Moger, Babu wrote:
>> On 4/20/2026 5:03 PM, Reinette Chatre wrote:
>>> On 4/20/26 12:38 PM, Babu Moger wrote:
>
>>>> The current mode change behavior is very restrictive.
>>>>
>>>> For example:
>>>>
>>>> # cat info/kernel_mode
>>>> inherit_ctrl_and_mon
>>>> [global_assign_ctrl_assign_mon_per_cpu]
>>>> global_assign_ctrl_inherit_mon_per_cpu
>>>>
>>>>
>>>> # cat info/kernel_mode_assignment
>>>> ctrl1/mon1/
>>>>
>>>> In this state, we cannot change kernel_mode to inherit_ctrl_and_mon. The expectation, however, is that inherit_ctrl_and_mon should always map to the RDTCTRL_GROUP.
>>>
>>> Could you please provide details behind the "we cannot change kernel_mode to
>>> inherit_ctrl_and_mon" statement? Why is this not possible?
>>>
>>> I do not see "inherit_ctrl_and_mon" to map to *any* group though. Expectation is
>>> that when user changes mode to "inherit_ctrl_and_mon" then
>>> info/kernel_mode_assignment would become invisible to user space.
>>
>> Ok. That is fine.
>>
>>
>> Sorry for not making it clear. Let’s consider the following scenario.
>>
>> The system boots with these default settings:
>>
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon]
>> global_assign_ctrl_assign_mon_per_cpu
>> global_assign_ctrl_inherit_mon_per_cpu
>>
>>
>> At this point, the interface info/kernel_mode_assignment is not visible.
>>
>> Next, lets create a new control group:
>>
>> # mkdir ctrl1
>>
>> We want to designate this group as the new kernel-mode group.
>>
>> First operation: Change the mode:
>>
>> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
>>
>> At this stage, only the kernel mode is being changed. However, there is no way to know which control group the user intends to assign to kernel mode. All we know here is the selected mode.
>>
>> After this operation, the info/kernel_mode_assignment interface should become visible. But the question is: what should it contain or point to at this moment?
>
> This was considered as part of original proposal per
> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
> (search for "default value") where the idea was that the group
> should be initialized to the default group.
>
>>
>> # cat info/kernel_mode_assignment
>> ??
>
> After
> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
> # cat info/kernel_mode_assignment
> /
>
> After
> # echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
> # cat info/kernel_mode_assignment
> //
>
> (although this is where previous discussion comes in on how interface
> can become inconsistent depending on what the previous kernel mode was)
This operation effectively promotes the default group (CLOSID 0) to the
kernel-mode group. Consequently, MSRs will be programmed on all threads,
which is not the user’s intent.
>
>>
>> Next operation: Assign the group
>>
>> # echo "ctrl1//" > info/kernel_mode_assignment
>>
Once again, this causes MSRs to be programmed with a new CLOSID(ctrl1)
which is actual intended result.
>>
>> Now the intended control group (ctrl1) is explicitly specified for kernel mode. In summary, changing the kernel mode requires two distinct inputs:
>>
>> - Selecting the kernel mode.
>> - Specifying the control group to be used for that mode.
>>
>>
>> Hope this makes sense.
>>
> Understood. Could you please elaborate what the problem is with making it so?
> Are you trying to eliminate one per-CPU register write? Is this something that
> ends up being very expensive? I assumed that a register designed to support
> modification during context switch should be fast. Or is it the IPI you are
> concerned about? Please help me to understand what the actual problem is that
> you are trying to solve.
> I think it is reasonable to start with defaults when changing the mode which
> I do not expect users to change often.
Note that these MSR writes are not occurring in the context-switch path.
However, every time the kernel mode is changed, we end up performing an
additional set of MSR writes, which is unnecessary overhead.
There is also another issue, as previously discussed: switching between
global_assign_ctrl_assign_mon_per_cpu and
global_assign_ctrl_inherit_mon_per_cpu, and vice versa.
One mode requires a CTRL_MON group, while the other requires a MON
group. Because of this mismatch in required group types, switching
between these modes is not possible.
We already discussed moving back to the default group on every mode
switch. Doing so here would once again cause extra MSR writes on each
mode transition, which is undesirable.
Thanks
Babu
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-21 0:21 UTC (permalink / raw)
To: Luck, Tony, Moger, Babu
Cc: Babu Moger, corbet@lwn.net, Dave.Martin@arm.com,
james.morse@arm.com, tglx@kernel.org, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com,
skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <aea-wdaZAWl2Al1h@agluck-desk3>
Hi Tony,
On 4/20/26 5:03 PM, Luck, Tony wrote:
>> The system boots with these default settings:
>>
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon]
>> global_assign_ctrl_assign_mon_per_cpu
>> global_assign_ctrl_inherit_mon_per_cpu
>>
>>
>> At this point, the interface info/kernel_mode_assignment is not visible.
>>
>> Next, lets create a new control group:
>>
>> # mkdir ctrl1
>
> This allocates a CLOSID and an RMID for this group.
>
>> We want to designate this group as the new kernel-mode group.
>>
>> First operation: Change the mode:
>>
>> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
>
> This mode needs a CLOSID for PLZA, but doesn't need an RMID.
>
>> At this stage, only the kernel mode is being changed. However, there is no
>> way to know which control group the user intends to assign to kernel mode.
>> All we know here is the selected mode.
>>
>> After this operation, the info/kernel_mode_assignment interface should
>> become visible. But the question is: what should it contain or point to at
>> this moment?
>>
>> # cat info/kernel_mode_assignment
>> ??
>>
>> Next operation: Assign the group
>>
>> # echo "ctrl1//" > info/kernel_mode_assignment
>
> Now ring0 code is using the CLOSID from the ctrl1 group.
... and user space tasks also continue to use the CLOSID from the
ctrl1 group.
It is up to user space to decide if a group is dedicated to kernel
mode or not. resctrl does not enforce it.
>
> But the RMID for this group isn't used.
RMID is still used by user mode that maintains existing behavior concerning
this group when considering its tasks/cpus/cpus_list files. RMID assigned to this
group is just not used for kernel mode.
>
> Are we OK with "wasting" an RMID in this way?
How do you see this RMID as "wasted"?
>
> Maybe it doesn't matter too much for AMD as you would just
> avoid assigning any counters to this group. But should Intel
> get around to doing PLZA-like functionality, that's a real
> loss of an RMID that might be useful elsewhere.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Luck, Tony @ 2026-04-21 0:03 UTC (permalink / raw)
To: Moger, Babu
Cc: Reinette Chatre, Babu Moger, corbet@lwn.net, Dave.Martin@arm.com,
james.morse@arm.com, tglx@kernel.org, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com,
skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <99a2da36-6a21-4a99-98e0-3c9a4cf7ecf6@amd.com>
> The system boots with these default settings:
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
> global_assign_ctrl_assign_mon_per_cpu
> global_assign_ctrl_inherit_mon_per_cpu
>
>
> At this point, the interface info/kernel_mode_assignment is not visible.
>
> Next, lets create a new control group:
>
> # mkdir ctrl1
This allocates a CLOSID and an RMID for this group.
> We want to designate this group as the new kernel-mode group.
>
> First operation: Change the mode:
>
> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
This mode needs a CLOSID for PLZA, but doesn't need an RMID.
> At this stage, only the kernel mode is being changed. However, there is no
> way to know which control group the user intends to assign to kernel mode.
> All we know here is the selected mode.
>
> After this operation, the info/kernel_mode_assignment interface should
> become visible. But the question is: what should it contain or point to at
> this moment?
>
> # cat info/kernel_mode_assignment
> ??
>
> Next operation: Assign the group
>
> # echo "ctrl1//" > info/kernel_mode_assignment
Now ring0 code is using the CLOSID from the ctrl1 group.
But the RMID for this group isn't used.
Are we OK with "wasting" an RMID in this way?
Maybe it doesn't matter too much for AMD as you would just
avoid assigning any counters to this group. But should Intel
get around to doing PLZA-like functionality, that's a real
loss of an RMID that might be useful elsewhere.
-Tony
^ permalink raw reply
* Re: [PATCH kernel 4/9] dma/swiotlb: Stop forcing SWIOTLB for TDISP devices
From: Jason Gunthorpe @ 2026-04-20 23:50 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: dan.j.williams, Robin Murphy, x86, linux-kernel, kvm, linux-pci,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Sean Christopherson, Paolo Bonzini,
Andy Lutomirski, Peter Zijlstra, Bjorn Helgaas, Marek Szyprowski,
Andrew Morton, Catalin Marinas, Michael Ellerman, Mike Rapoport,
Tom Lendacky, Ard Biesheuvel, Ashish Kalra, Stefano Garzarella,
Melody Wang, Seongman Lee, Joerg Roedel, Nikunj A Dadhania,
Michael Roth, Suravee Suthikulpanit, Andi Kleen,
Kuppuswamy Sathyanarayanan, Tony Luck, David Woodhouse,
Greg Kroah-Hartman, Denis Efremov, Geliang Tang, Piotr Gregor,
Michael S. Tsirkin, Alex Williamson, Arnd Bergmann, Jesse Barnes,
Jacob Pan, Yinghai Lu, Kevin Brodsky, Jonathan Cameron,
Aneesh Kumar K.V (Arm), Xu Yilun, Herbert Xu, Kim Phillips,
Konrad Rzeszutek Wilk, Stefano Stabellini, Claire Chang,
linux-coco, iommu
In-Reply-To: <137e5595-390e-49a7-8918-9ca057f7ebdd@amd.com>
On Wed, Apr 15, 2026 at 04:32:14PM +1000, Alexey Kardashevskiy wrote:
> > > So the DMA API should see the DMA_ATTR_CC_DECRYPTED and setup the
> > > correct dma_dddr_t either by choosing the shared alias for the TDISP
> > > device's vTOM, or setting the C bit in a vIOMMU S1.
> >
> > Something like that?
> >
> > https://github.com/AMDESE/linux-kvm/commit/266a41a1ea746557eb63debce886ce2c98820667
> >
> > With some little hacks I can make this tree do TDISP DMA to private or shared (swiotlb) memory by steering via this vTOM thing. Thanks,
>
> Ping? Thanks,
That seems approx right, it is broadl similar to what ARM is
doing.. But the address map changes when switching to T=1 for AMD?
Jason
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-20 23:34 UTC (permalink / raw)
To: Moger, Babu, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <99a2da36-6a21-4a99-98e0-3c9a4cf7ecf6@amd.com>
Hi Babu,
On 4/20/26 3:59 PM, Moger, Babu wrote:
> On 4/20/2026 5:03 PM, Reinette Chatre wrote:
>> On 4/20/26 12:38 PM, Babu Moger wrote:
>>> The current mode change behavior is very restrictive.
>>>
>>> For example:
>>>
>>> # cat info/kernel_mode
>>> inherit_ctrl_and_mon
>>> [global_assign_ctrl_assign_mon_per_cpu]
>>> global_assign_ctrl_inherit_mon_per_cpu
>>>
>>>
>>> # cat info/kernel_mode_assignment
>>> ctrl1/mon1/
>>>
>>> In this state, we cannot change kernel_mode to inherit_ctrl_and_mon. The expectation, however, is that inherit_ctrl_and_mon should always map to the RDTCTRL_GROUP.
>>
>> Could you please provide details behind the "we cannot change kernel_mode to
>> inherit_ctrl_and_mon" statement? Why is this not possible?
>>
>> I do not see "inherit_ctrl_and_mon" to map to *any* group though. Expectation is
>> that when user changes mode to "inherit_ctrl_and_mon" then
>> info/kernel_mode_assignment would become invisible to user space.
>
> Ok. That is fine.
>
>
> Sorry for not making it clear. Let’s consider the following scenario.
>
> The system boots with these default settings:
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
> global_assign_ctrl_assign_mon_per_cpu
> global_assign_ctrl_inherit_mon_per_cpu
>
>
> At this point, the interface info/kernel_mode_assignment is not visible.
>
> Next, lets create a new control group:
>
> # mkdir ctrl1
>
> We want to designate this group as the new kernel-mode group.
>
> First operation: Change the mode:
>
> # echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
>
> At this stage, only the kernel mode is being changed. However, there is no way to know which control group the user intends to assign to kernel mode. All we know here is the selected mode.
>
> After this operation, the info/kernel_mode_assignment interface should become visible. But the question is: what should it contain or point to at this moment?
This was considered as part of original proposal per
https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
(search for "default value") where the idea was that the group
should be initialized to the default group.
>
> # cat info/kernel_mode_assignment
> ??
After
# echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
# cat info/kernel_mode_assignment
/
After
# echo "global_assign_ctrl_assign_mon_per_cpu" > info/kernel_mode
# cat info/kernel_mode_assignment
//
(although this is where previous discussion comes in on how interface
can become inconsistent depending on what the previous kernel mode was)
>
> Next operation: Assign the group
>
> # echo "ctrl1//" > info/kernel_mode_assignment
>
>
> Now the intended control group (ctrl1) is explicitly specified for kernel mode. In summary, changing the kernel mode requires two distinct inputs:
>
> - Selecting the kernel mode.
> - Specifying the control group to be used for that mode.
>
>
> Hope this makes sense.
>
Understood. Could you please elaborate what the problem is with making it so?
Are you trying to eliminate one per-CPU register write? Is this something that
ends up being very expensive? I assumed that a register designed to support
modification during context switch should be fast. Or is it the IPI you are
concerned about? Please help me to understand what the actual problem is that
you are trying to solve.
I think it is reasonable to start with defaults when changing the mode which
I do not expect users to change often.
Reinette
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Moger, Babu @ 2026-04-20 22:59 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <b74cfe34-e23e-49e3-beb4-d5639d42d5cc@intel.com>
Hi Reinette,
On 4/20/2026 5:03 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/20/26 12:38 PM, Babu Moger wrote:
>> On 4/9/26 22:41, Reinette Chatre wrote:
>>> On 4/9/26 4:42 PM, Moger, Babu wrote:
>>>> On 4/9/2026 3:50 PM, Reinette Chatre wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On 4/9/26 11:05 AM, Moger, Babu wrote:
>>>>>> On 4/9/2026 12:26 PM, Reinette Chatre wrote:
>>>>>>> On 4/9/26 10:19 AM, Moger, Babu wrote:
>>>>>>>> On 4/8/2026 6:41 PM, Reinette Chatre wrote:
>>>>>>>
>>>>>>>>> When the user switches to either "global_assign_ctrl_inherit_mon_per_cpu" or
>>>>>>>>> 'global_assign_ctrl_assign_mon_per_cpu" then "info/kernel_mode_assignment" is created
>>>>>>>>> (or made visible to user space) and is expected to point to default group.
>>>>>>>>> User can change the group using "info/kernel_mode_assignment" at this point.
>>>>>>>>>
>>>>>>>>> If the current scenario is below ...
>>>>>>>>> # cat info/kernel_mode
>>>>>>>>> [global_assign_ctrl_inherit_mon_per_cpu]
>>>>>>>>> inherit_ctrl_and_mon
>>>>>>>>> global_assign_ctrl_assign_mon_per_cpu
>>>>>>>>>
>>>>>>>>> ... then "info/kernel_mode_assignment" will exist but what it should contain if
>>>>>>>>> user switches mode at this point may be up for discussion.
>>>>>>>>>
>>>>>>>>> option 1)
>>>>>>>>> When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
>>>>>>>>> the resource group in "info/kernel_mode_assignment" is reset to the
>>>>>>>>> default group and all CPUs PLZA state reset to match. The kernel_mode_cpus
>>>>>>>>> and kernel_mode_cpuslist files become visible in default resource group
>>>>>>>>> and they contain "all online CPUs".
>>>>>>>>>
>>>>>>>>> option 2)
>>>>>>>>> When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
>>>>>>>>> the resource group in "info/kernel_mode_assignment" is kept and all
>>>>>>>>> CPUs PLZA state set to match it while also keeping the current
>>>>>>>>> values of that resource group's kernel_mode_cpus and kernel_mode_cpuslist
>>>>>>>>> files.
>>>>>>>>>
>>>>>>>>> I am leaning towards "option 1" to keep it consistent with a switch from
>>>>>>>>> "inherit_ctrl_and_mon" and being deterministic about how a mode is started with
>>>>>>>>
>>>>>>>> Yes. The "option 1" seems appropriate.
>>>>>>>>
>>>>>>>>> a clean slate. What are your thoughts? What would be use case where a user would
>>>>>>>>> want to switch between "global_assign_ctrl_inherit_mon_per_cpu" and
>>>>>>>>> "global_assign_ctrl_assign_mon_per_cpu" to just switch rmid_en on and off?
>>>>>>>>
>>>>>>>>
>>>>>>>> This is a bit tricky.
>>>>>>>>
>>>>>>>> Currently, our requirement is to have a CTRL_MON group for
>>>>>>>> global_assign_ctrl_inherit_mon_per_cpu. In this scenario, we use the
>>>>>>>> group’s CLOSID for PLZA configuration, and RMID is not used (rmid_en
>>>>>>>> = 0) when setting up PLZA.
>>>>>>>>
>>>>>>>> Our requirement is also to have a CTRL_MON/MON group for
>>>>>>>> global_assign_ctrl_assign_mon_per_cpu. In this case as well, the
>>>>>>>> group’s CLOSID and RMID (rmid_en = 1) both are used configure PLZA.
>>>>>>>
>>>>>>> ah, right. Good catch.
>>>>>>>
>>>>>>>>
>>>>>>>> Actually, we should not allow these changes from
>>>>>>>> global_assign_ctrl_inherit_mon_per_cpu to
>>>>>>>> global_assign_ctrl_assign_mon_per_cpu or visa versa.
>>>>>>>
>>>>>>> resctrl could allow it but as part of the switch it resets the "kernel mode group" to
>>>>>>> be the default group every time? This would be the "option 1" above.
>>>>>>
>>>>>> Other options.
>>>>>>
>>>>>> Allow global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_assign_mon_per_cpu. As part of the switch, reset the "kernel mode group" to the default group.
>>>>>>
>>>>>> Allow global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_inherit_mon_per_cpu. In this case switch
>>>>>> to CTRL_MON/MON -> CTRL_MON.
>>>>>>
>>>>>
>>>>> ok. Could you please return the courtesy of providing feedback on the
>>>>> suggestion you are responding to and also include the motivation why your
>>>>> suggestion is the better option?
>>>>
>>>> Yea. Sure.
>>>>
>>>> We need to allow the switch between the modes. Otherwise only way to reset is to remount the resctrl filesystem. That is not a good option.
>>>>
>>>> Allow global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_assign_mon_per_cpu. As part of the switch, reset the "kernel mode group" to the default group.
>>>>
>>>> This option is same as you suggested.
>>>>
>>>> Allow global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_inherit_mon_per_cpu. In this case switch
>>>> to CTRL_MON/MON -> CTRL_MON. This option basically disables monitor (rmid_en=0). It is less disruptive. Move is between child group to parent group.
>>>
>>> ok. I am concerned that this creates an inconsistent interface. Specifically, sometimes
>>> when switching the mode the kernel group will reset and sometimes it won't. This inconsistency
>>> may be more apparent when writing the user documentation as part of this work. If you are
>>> able to clearly explain how this resctrl fs interface behaves (this cannot be about PLZA
>>> internals as above) then this could work.
>> Started working on these changes. May be it is better to discuss this before to avoid one more revision.
>>
>>
>> The current mode change behavior is very restrictive.
>>
>> For example:
>>
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon
>> [global_assign_ctrl_assign_mon_per_cpu]
>> global_assign_ctrl_inherit_mon_per_cpu
>>
>>
>> # cat info/kernel_mode_assignment
>> ctrl1/mon1/
>>
>> In this state, we cannot change kernel_mode to inherit_ctrl_and_mon. The expectation, however, is that inherit_ctrl_and_mon should always map to the RDTCTRL_GROUP.
>
> Could you please provide details behind the "we cannot change kernel_mode to
> inherit_ctrl_and_mon" statement? Why is this not possible?
>
> I do not see "inherit_ctrl_and_mon" to map to *any* group though. Expectation is
> that when user changes mode to "inherit_ctrl_and_mon" then
> info/kernel_mode_assignment would become invisible to user space.
Ok. That is fine.
Sorry for not making it clear. Let’s consider the following scenario.
The system boots with these default settings:
# cat info/kernel_mode
[inherit_ctrl_and_mon]
global_assign_ctrl_assign_mon_per_cpu
global_assign_ctrl_inherit_mon_per_cpu
At this point, the interface info/kernel_mode_assignment is not visible.
Next, lets create a new control group:
# mkdir ctrl1
We want to designate this group as the new kernel-mode group.
First operation: Change the mode:
# echo "global_assign_ctrl_inherit_mon_per_cpu" > info/kernel_mode
At this stage, only the kernel mode is being changed. However, there is
no way to know which control group the user intends to assign to kernel
mode. All we know here is the selected mode.
After this operation, the info/kernel_mode_assignment interface should
become visible. But the question is: what should it contain or point to
at this moment?
# cat info/kernel_mode_assignment
??
Next operation: Assign the group
# echo "ctrl1//" > info/kernel_mode_assignment
Now the intended control group (ctrl1) is explicitly specified for
kernel mode. In summary, changing the kernel mode requires two distinct
inputs:
- Selecting the kernel mode.
- Specifying the control group to be used for that mode.
Hope this makes sense.
Thanks
Babu
>
>>
>>
>> A similar issue exists when switching between
>> global_assign_ctrl_inherit_mon_per_cpu and
>> global_assign_ctrl_assign_mon_per_cpu (in either direction).
>
> What similar issue? Could you please provide some detail to help me understand what the
> issue is? Isn't this what we just discussed in thread you are replying to? That is, you were
> looking at developing that interface that I viewed as "inconsistent"?
>
>>
>> The same problem also occurs when modifying the kernel_mode_assignment group. If the current group is an RDTMON_GROUP, we can't assign another
>> RDTCTRL_GROUP without changing both mode and group together.
>
> Same problem? Still unclear what the problem is. So far three problems are mentioned but I am
> not able to decipher what the problems are. Could you please elaborate?
> When modifying the kernel_mode_assignment group I expect that the interface
> will only accept a MON group when in "assign_mon" mode and a CTRL group when
> in "inherit_mon" mode.
> I do not understand what you mean with *another* RDTCTRL_GROUP. Only one group
> can be assigned at any time, no?
>
> Reinette
>
^ permalink raw reply
* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-20 22:03 UTC (permalink / raw)
To: Babu Moger, Moger, Babu, corbet@lwn.net, tony.luck@intel.com,
Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
akpm@linux-foundation.org, pmladek@suse.com,
rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
kees@kernel.org, elver@google.com, paulmck@kernel.org,
lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
Lendacky, Thomas, elena.reshetova@intel.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org,
eranian@google.com, peternewman@google.com
In-Reply-To: <39e0c786-cc35-4555-bfb9-ff7cd758c423@amd.com>
Hi Babu,
On 4/20/26 12:38 PM, Babu Moger wrote:
> On 4/9/26 22:41, Reinette Chatre wrote:
>> On 4/9/26 4:42 PM, Moger, Babu wrote:
>>> On 4/9/2026 3:50 PM, Reinette Chatre wrote:
>>>> Hi Babu,
>>>>
>>>> On 4/9/26 11:05 AM, Moger, Babu wrote:
>>>>> On 4/9/2026 12:26 PM, Reinette Chatre wrote:
>>>>>> On 4/9/26 10:19 AM, Moger, Babu wrote:
>>>>>>> On 4/8/2026 6:41 PM, Reinette Chatre wrote:
>>>>>>
>>>>>>>> When the user switches to either "global_assign_ctrl_inherit_mon_per_cpu" or
>>>>>>>> 'global_assign_ctrl_assign_mon_per_cpu" then "info/kernel_mode_assignment" is created
>>>>>>>> (or made visible to user space) and is expected to point to default group.
>>>>>>>> User can change the group using "info/kernel_mode_assignment" at this point.
>>>>>>>>
>>>>>>>> If the current scenario is below ...
>>>>>>>> # cat info/kernel_mode
>>>>>>>> [global_assign_ctrl_inherit_mon_per_cpu]
>>>>>>>> inherit_ctrl_and_mon
>>>>>>>> global_assign_ctrl_assign_mon_per_cpu
>>>>>>>>
>>>>>>>> ... then "info/kernel_mode_assignment" will exist but what it should contain if
>>>>>>>> user switches mode at this point may be up for discussion.
>>>>>>>>
>>>>>>>> option 1)
>>>>>>>> When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
>>>>>>>> the resource group in "info/kernel_mode_assignment" is reset to the
>>>>>>>> default group and all CPUs PLZA state reset to match. The kernel_mode_cpus
>>>>>>>> and kernel_mode_cpuslist files become visible in default resource group
>>>>>>>> and they contain "all online CPUs".
>>>>>>>>
>>>>>>>> option 2)
>>>>>>>> When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
>>>>>>>> the resource group in "info/kernel_mode_assignment" is kept and all
>>>>>>>> CPUs PLZA state set to match it while also keeping the current
>>>>>>>> values of that resource group's kernel_mode_cpus and kernel_mode_cpuslist
>>>>>>>> files.
>>>>>>>>
>>>>>>>> I am leaning towards "option 1" to keep it consistent with a switch from
>>>>>>>> "inherit_ctrl_and_mon" and being deterministic about how a mode is started with
>>>>>>>
>>>>>>> Yes. The "option 1" seems appropriate.
>>>>>>>
>>>>>>>> a clean slate. What are your thoughts? What would be use case where a user would
>>>>>>>> want to switch between "global_assign_ctrl_inherit_mon_per_cpu" and
>>>>>>>> "global_assign_ctrl_assign_mon_per_cpu" to just switch rmid_en on and off?
>>>>>>>
>>>>>>>
>>>>>>> This is a bit tricky.
>>>>>>>
>>>>>>> Currently, our requirement is to have a CTRL_MON group for
>>>>>>> global_assign_ctrl_inherit_mon_per_cpu. In this scenario, we use the
>>>>>>> group’s CLOSID for PLZA configuration, and RMID is not used (rmid_en
>>>>>>> = 0) when setting up PLZA.
>>>>>>>
>>>>>>> Our requirement is also to have a CTRL_MON/MON group for
>>>>>>> global_assign_ctrl_assign_mon_per_cpu. In this case as well, the
>>>>>>> group’s CLOSID and RMID (rmid_en = 1) both are used configure PLZA.
>>>>>>
>>>>>> ah, right. Good catch.
>>>>>>
>>>>>>>
>>>>>>> Actually, we should not allow these changes from
>>>>>>> global_assign_ctrl_inherit_mon_per_cpu to
>>>>>>> global_assign_ctrl_assign_mon_per_cpu or visa versa.
>>>>>>
>>>>>> resctrl could allow it but as part of the switch it resets the "kernel mode group" to
>>>>>> be the default group every time? This would be the "option 1" above.
>>>>>
>>>>> Other options.
>>>>>
>>>>> Allow global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_assign_mon_per_cpu. As part of the switch, reset the "kernel mode group" to the default group.
>>>>>
>>>>> Allow global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_inherit_mon_per_cpu. In this case switch
>>>>> to CTRL_MON/MON -> CTRL_MON.
>>>>>
>>>>
>>>> ok. Could you please return the courtesy of providing feedback on the
>>>> suggestion you are responding to and also include the motivation why your
>>>> suggestion is the better option?
>>>
>>> Yea. Sure.
>>>
>>> We need to allow the switch between the modes. Otherwise only way to reset is to remount the resctrl filesystem. That is not a good option.
>>>
>>> Allow global_assign_ctrl_inherit_mon_per_cpu -> global_assign_ctrl_assign_mon_per_cpu. As part of the switch, reset the "kernel mode group" to the default group.
>>>
>>> This option is same as you suggested.
>>>
>>> Allow global_assign_ctrl_assign_mon_per_cpu -> global_assign_ctrl_inherit_mon_per_cpu. In this case switch
>>> to CTRL_MON/MON -> CTRL_MON. This option basically disables monitor (rmid_en=0). It is less disruptive. Move is between child group to parent group.
>>
>> ok. I am concerned that this creates an inconsistent interface. Specifically, sometimes
>> when switching the mode the kernel group will reset and sometimes it won't. This inconsistency
>> may be more apparent when writing the user documentation as part of this work. If you are
>> able to clearly explain how this resctrl fs interface behaves (this cannot be about PLZA
>> internals as above) then this could work.
> Started working on these changes. May be it is better to discuss this before to avoid one more revision.
>
>
> The current mode change behavior is very restrictive.
>
> For example:
>
> # cat info/kernel_mode
> inherit_ctrl_and_mon
> [global_assign_ctrl_assign_mon_per_cpu]
> global_assign_ctrl_inherit_mon_per_cpu
>
>
> # cat info/kernel_mode_assignment
> ctrl1/mon1/
>
> In this state, we cannot change kernel_mode to inherit_ctrl_and_mon. The expectation, however, is that inherit_ctrl_and_mon should always map to the RDTCTRL_GROUP.
Could you please provide details behind the "we cannot change kernel_mode to
inherit_ctrl_and_mon" statement? Why is this not possible?
I do not see "inherit_ctrl_and_mon" to map to *any* group though. Expectation is
that when user changes mode to "inherit_ctrl_and_mon" then
info/kernel_mode_assignment would become invisible to user space.
>
>
> A similar issue exists when switching between
> global_assign_ctrl_inherit_mon_per_cpu and
> global_assign_ctrl_assign_mon_per_cpu (in either direction).
What similar issue? Could you please provide some detail to help me understand what the
issue is? Isn't this what we just discussed in thread you are replying to? That is, you were
looking at developing that interface that I viewed as "inconsistent"?
>
> The same problem also occurs when modifying the kernel_mode_assignment group. If the current group is an RDTMON_GROUP, we can't assign another
> RDTCTRL_GROUP without changing both mode and group together.
Same problem? Still unclear what the problem is. So far three problems are mentioned but I am
not able to decipher what the problems are. Could you please elaborate?
When modifying the kernel_mode_assignment group I expect that the interface
will only accept a MON group when in "assign_mon" mode and a CTRL group when
in "inherit_mon" mode.
I do not understand what you mean with *another* RDTCTRL_GROUP. Only one group
can be assigned at any time, no?
Reinette
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox