From: Reinette Chatre <reinette.chatre@intel.com>
To: Babu Moger <babu.moger@amd.com>, <corbet@lwn.net>,
<tony.luck@intel.com>, <Dave.Martin@arm.com>,
<james.morse@arm.com>, <tglx@kernel.org>, <mingo@redhat.com>,
<bp@alien8.de>, <dave.hansen@linux.intel.com>
Cc: <skhan@linuxfoundation.org>, <x86@kernel.org>, <hpa@zytor.com>,
<peterz@infradead.org>, <juri.lelli@redhat.com>,
<vincent.guittot@linaro.org>, <dietmar.eggemann@arm.com>,
<rostedt@goodmis.org>, <bsegall@google.com>, <mgorman@suse.de>,
<vschneid@redhat.com>, <kas@kernel.org>,
<rick.p.edgecombe@intel.com>, <akpm@linux-foundation.org>,
<pmladek@suse.com>, <rdunlap@infradead.org>,
<dapeng1.mi@linux.intel.com>, <kees@kernel.org>,
<elver@google.com>, <paulmck@kernel.org>, <lirongqing@baidu.com>,
<safinaskar@gmail.com>, <fvdl@google.com>, <seanjc@google.com>,
<pawan.kumar.gupta@linux.intel.com>, <xin@zytor.com>,
<tiala@microsoft.com>, <Neeraj.Upadhyay@amd.com>,
<chang.seok.bae@intel.com>, <thomas.lendacky@amd.com>,
<elena.reshetova@intel.com>, <linux-doc@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-coco@lists.linux.dev>,
<kvm@vger.kernel.org>, <eranian@google.com>,
<peternewman@google.com>
Subject: Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Date: Fri, 27 Mar 2026 15:11:44 -0700
Message-ID: <88eebfac-5286-4788-b244-911c659c0439@intel.com>
In-Reply-To: <47c0db32-d0e0-4c53-90bd-b74863d233dc@amd.com>
Hi Babu,
On 3/26/26 10:12 AM, Babu Moger wrote:
> Hi Reinette,
>
> Thanks for the review comments. Will address one by one.
>
> On 3/24/26 17:51, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/12/26 1:36 PM, Babu Moger wrote:
>>> This series adds support for Privilege-Level Zero Association (PLZA) to the
>>> resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
>>> and/or RMID for execution in kernel mode (privilege level zero), so that
>>> kernel work is not subject to the same resource constraints as the current
>>> user-space task. This avoids kernel operations being aggressively throttled
>>> when a task's memory bandwidth is heavily limited.
>>>
>>> The feature documentation is not yet publicly available, but it is expected
>>> to be released in the next few weeks. In the meantime, a brief description
>>> of the features is provided below.
>>>
>>> Privilege Level Zero Association (PLZA)
>>>
>>> Privilege Level Zero Association (PLZA) allows the hardware to
>>> automatically associate execution in Privilege Level Zero (CPL=0) with a
>>> specific COS (Class of Service) and/or RMID (Resource Monitoring
>>> Identifier). The QoS feature set already has a mechanism to associate
>>> execution on each logical processor with an RMID or COS. PLZA allows the
>>> system to override this per-thread association for a thread that is
>>> executing with CPL=0.
>>> ------------------------------------------------------------------------
>>>
>>> The series introduces the feature in a way that supports the interface in
>>> a generic manner to accommodate MPAM or other vendor-specific implementations.
>>>
>>> Below are the detailed requirements provided by Reinette:
>>> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
>> Our discussion considered how resctrl could support PLZA in a generic way while
>> also preparing to support MPAM's variants and how PLZA may evolve to have similar
>> capabilities when considering the capabilities of its registers.
>>
>> This does not mean that your work needs to implement everything that was discussed.
>> Instead, this work is expected to just support what PLZA is capable of today but
>> do so in a way that the future enhancements could be added to.
>>
>> This series is quite difficult to follow since it appears to implement a full
>> featured generic interface while PLZA cannot take advantage of it.
>>
>> Could you please simplify this work to focus on just enabling PLZA and only
>> add interfaces needed to do so?
> Sure. Will try. Let's continue the discussion.
>>
>>> Summary:
>>> 1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
>>> info directory:/sys/fs/resctrl/info/, not as a separate or arch-specific path.
>>>
>>> 2. Add two info files
>>>
>>> a. kernel_mode
>>> Purpose: Control how resource allocation and monitoring apply in kernel mode
>>> (e.g. inherit from task vs global assign).
>>>
>>> Read: List supported modes and show current one (e.g. with [brackets]).
>>> Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
>>>
>>> b. kernel_mode_assignment
>>>
>>> Purpose: When a “global assign” kernel mode is active, specify which resctrl group
>>> (CLOSID/RMID) is used for kernel work.
>>>
>>> Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
>>> Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
>>>
>>> The patches are based on top of commit (v7.0.0-rc3)
>>> 839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
>>> ------------------------------------------------------------------------
>>>
>>> Examples: kernel_mode and kernel_mode_assignment
>>>
>>> All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
>>> /sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
>>> must support the relevant modes (e.g. AMD with PLZA).
>>>
>>> 1) kernel_mode — show and set the current kernel mode
>>>
>>> Read supported modes and which one is active (current in brackets):
>>>
>>> $ cat info/kernel_mode
>>> [inherit_ctrl_and_mon]
>>> global_assign_ctrl_inherit_mon
>>> global_assign_ctrl_assign_mon
>>>
>>> Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
>>>
>>> $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>> $ cat info/kernel_mode
>>> inherit_ctrl_and_mon
>>> global_assign_ctrl_inherit_mon
>>> [global_assign_ctrl_assign_mon]
>>>
>>> Mode meanings:
>>> - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
>>> - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
>>> - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
>>>
>>> 2) kernel_mode_assignment — show and set which group is used for kernel work
>>>
>>> Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
>> To help with future usages please connect visibility of this file with the mode in
>> info/kernel_mode. This helps us to support future modes with other resctrl files, possibly
>> within each resource group.
>> Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
>> while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.
>
> Sure. Will do.
>
>>
>>> currently assigned group (path format is "CTRL_MON/MON/"):
>> The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
>> then it should only contain a control group, alternatively, if the mode is
>> "global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
>> resctrl future flexibility to change format for future modes.
>
> This can be done both ways. The whole purpose of these groups is to get the CLOSID and RMID to enable PLZA. The user can echo a CTRL_MON or MON group to kernel_mode_assignment in any of the modes. We can decide what needs to be updated in the MSR (PQR_PLZA_ASSOC) based on which kernel mode is selected.
The "both ways" are specific to one of the two active modes though.
PLZA only needs the RMID when the mode is "global_assign_ctrl_assign_mon".
Displaying and parsing a monitor group when the mode is
"global_assign_ctrl_inherit_mon" creates an inconsistent interface since the mode
only uses a control group. The interface to user space should match the mode,
otherwise it becomes confusing.
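
To illustrate what I mean, here is a rough user-space sketch (Python, not
kernel code) of a parser whose accepted format follows the active mode. The
"CTRL_MON/MON/" form is the one from the cover letter. The control-group-only
"CTRL/" form and the name parse_assignment() are only my assumptions for
illustration; the actual syntax is still open.

```python
# Hypothetical model of mode-dependent parsing of info/kernel_mode_assignment.
# The "CTRL/" form for the ctrl-only mode is an assumed syntax, not something
# the series or resctrl defines today.

CTRL_ONLY = "global_assign_ctrl_inherit_mon"
CTRL_AND_MON = "global_assign_ctrl_assign_mon"


def parse_assignment(mode, text):
    """Return (ctrl_group, mon_group) for the given mode.

    mon_group is None when the mode does not use a monitor group. An empty
    name stands for the default group. Raise ValueError when the text does
    not match the format of the mode.
    """
    text = text.strip()
    if not text.endswith("/"):
        raise ValueError("group path must end with '/'")
    parts = text[:-1].split("/")
    if mode == CTRL_ONLY:
        if len(parts) != 1:
            raise ValueError("mode only takes a control group")
        return parts[0], None
    if mode == CTRL_AND_MON:
        if len(parts) != 2:
            raise ValueError("mode takes a control and a monitor group")
        return parts[0], parts[1]
    raise ValueError("no assignment in mode %r" % mode)
```

With this, 'ctrl1/mon1/' is accepted in "global_assign_ctrl_assign_mon" but
rejected in "global_assign_ctrl_inherit_mon", so what user space reads and
writes always matches what the mode actually uses.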
...
>>>
>>> Tony suggested using global variables to store the kernel mode
>>> CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>> coming from rdtgroup structure with the new interface. Accessing
>>> them requires holding the associated lock, which would make the
>>> context switch path unnecessarily expensive. So, dropped the idea.
>>> https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>> Let me know if there are other ways to optimize this.
>> I do not see why the context switch path needs to be touched at all with this
>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>> info/kernel_mode_assignment?
>
> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID. PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
Based on the previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@amd.com/
and this implementation, all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
same for all CPUs on the system, not just per QoS domain. Could you please confirm?
>
> So, PQR_PLZA_ASSOC is a per thread MSR just like PQR_ASSOC.
>
> Privilege-Level Zero Association (PLZA) allows the user to specify a COS and/or RMID associated with execution in Privilege-Level Zero. When enabled on a HW thread, when that thread enters Privilege-Level Zero, transactions associated with that thread will be associated with the PLZA COS and/or RMID. Otherwise, the HW thread will be associated with the COS and RMID identified by PQR_ASSOC.
>
> More below.
>
>>
>> Consider some of the scenarios:
>>
>> resctrl mount with default state:
>>
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon]
>> global_assign_ctrl_inherit_mon
>> global_assign_ctrl_assign_mon
>> # ls info/kernel_mode_assignment
>> ls: cannot access 'info/kernel_mode_assignment': No such file or directory
>>
>> enable global_assign_ctrl_assign_mon mode:
>> # echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>
>> Expectation here is that when user space sets this mode as above then resctrl would
>> in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>> MSR_IA32_PQR_PLZA_ASSOC.rmid=0
>> MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>> MSR_IA32_PQR_PLZA_ASSOC.closid=0
>> MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>> MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>>
>> I do not see why it is necessary to maintain any per-CPU or per-task state or needing
>> to touch the context switch code. Since PLZA only supports global could it not
>> just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
>> Only caveat is that if a CPU is offline then this setting needs to be stashed
>> so that MSR_IA32_PQR_PLZA_ASSOC can be set when new CPU comes online.
>>
>> The way that rdtgroup_config_kmode() introduced in patch #11 assumes it is dealing
>> with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
>> CPU mask seems unnecessary to me as well as error prone since the system may only
>> have, for example, RDT_RESOURCE_MBA enabled or even just monitoring. Why not just set
>> MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?
>>
>> To continue the scenarios ...
>>
>> After user's setting above related files read:
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon
>> global_assign_ctrl_inherit_mon
>> [global_assign_ctrl_assign_mon]
>> # cat info/kernel_mode_assignment
>> //
>>
>> Modify group used by global_assign_ctrl_assign_mon mode:
>> # echo 'ctrl1/mon1/' > info/kernel_mode_assignment
>>
>> Expectation here is that when user space sets this then resctrl would
>> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>> MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
>> MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>> MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
>> MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>> MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>
>
> This works correctly when PLZA associations are defined per CPU. For example, let's assume that ctrl1 is assigned CLOSID 1.
>
> In this scenario, every task in the system running on any CPU will use the limits associated with CLOSID 1 whenever it enters Privilege-Level Zero, because the CPU's PQR_PLZA_ASSOC register has PLZA enabled and CLOSID is 1.
>
> Now consider task-based association:
>
> We have two resctrl groups:
>
> * ctrl1 -> CLOSID 1 -> task1.plza = 1: User wants PLZA to be enabled
>   for this task.
> * ctrl2 -> CLOSID 2 -> task2.plza = 0: User wants PLZA to be
>   disabled for this task.
>
> Suppose task1 is first scheduled on CPU 0. This behaves as expected: since CPU 0's PQR_PLZA_ASSOC contains CLOSID 1, plza_en = 1, task1 will use the limits from CLOSID 1 when it enters Privilege-Level Zero.
>
> However, if task2 later runs on CPU 0, we expect it to use CLOSID 2 in both user mode and kernel mode, because the user has PLZA disabled for this task. But CPU 0 still has CLOSID 1, plza_en = 1 in its PQR_PLZA_ASSOC register.
>
> As a result, task2 will incorrectly run with CLOSID 1 when entering Privilege-Level Zero, something we explicitly want to avoid.
>
> At that point, PLZA must be disabled on CPU 0 to prevent the unintended association. Hope this explanation makes the issue clear.
>
A couple of points:
- Looks like we still need to come to agreement on what is meant by "global" when
  it comes to kernel mode.
  In your description there is a "global" configuration, but the assignment is
  "per-task". To me this sounds like a new and distinct kernel_mode from the
  "global" modes considered so far. This seems to move to the "per_task" mode
  mentioned in the discussion linked below, but the implementation does not take
  into account any of the earlier discussions surrounding it:
  https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
  We only learned about one use case in https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@mail.gmail.com/
  As I understand it, this use case requires PLZA to be globally enabled for all
  tasks. Thus I consider task assignment to be "global" when in the "global_*"
  kernel modes.
  If this is indeed a common use case then supporting only global configuration
  but then requiring user space to manually assign all tasks afterwards sounds
  cumbersome for user space and also detrimental to system performance with all
  the churn to modify all the task_structs involved. The accompanying
  documentation does not mention all the additional interactions required from
  user space to use this implementation.
  I find this implementation difficult and inefficient to use in the one use case
  we know of. I would suggest that resctrl optimizes for the one known use case.
- This implementation ignores the discussion on how existing resctrl files should
  not be repurposed.
  This implementation allows user space to set a resource group in
  kernel_mode_assignment with the consequence that this resource group's
  "tasks" file changes behavior. I consider this a break of the resctrl interface.
  We did briefly consider per-task configuration/assignment in previous discussion
  and the proposal was for it to use a new file (only when and if needed!).
- Now a user is required to write the task id of every task that participates
  in PLZA. Apart from the churn already mentioned, this also breaks existing
  usage since it is no longer possible for new tasks to be added to this
  resource group. This creates an awkward interface where all tasks belonging
  to a resource group inherit the allocations/monitoring for their user space
  work and will get PLZA enabled whether the user requested it or not, while
  tasks from other resource groups need to be explicitly enabled. This creates
  an inconsistency when it comes to task assignment. The only way to "remove"
  PLZA from such a task would be to assign it to another resource group which
  may not have the user space allocations ... and once this is done the task
  cannot be moved back.
  There is no requirement that CLOSID/RMID should be dedicated to kernel work
  but this implementation does so in an inconsistent way.
- Apart from the same issues as with repurposing of the tasks file, why should the
  same CPU allocation be used for kernel and user space?
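
To make the "global" handling I have in mind concrete, it can be modeled
roughly as follows. This is a user-space Python sketch, not kernel code. The
class and method names are made up for illustration, the per-CPU dictionary
merely stands in for MSR_IA32_PQR_PLZA_ASSOC, and two details are my
assumptions: that rmid_en stays 0 in "global_assign_ctrl_inherit_mon" and that
plza_en is cleared when returning to "inherit_ctrl_and_mon". The point is that
only writes to info/kernel_mode and info/kernel_mode_assignment, plus a CPU
coming online, touch the register. There is no per-task state and nothing in
the context switch path.

```python
from dataclasses import dataclass, replace

INHERIT = "inherit_ctrl_and_mon"
CTRL_ONLY = "global_assign_ctrl_inherit_mon"
CTRL_AND_MON = "global_assign_ctrl_assign_mon"


@dataclass(frozen=True)
class PlzaAssoc:
    """Fields named after the MSR_IA32_PQR_PLZA_ASSOC fields in this thread."""
    rmid: int = 0
    rmid_en: int = 0
    closid: int = 0
    closid_en: int = 0
    plza_en: int = 0


class GlobalKmode:
    """Hypothetical model: one stashed value, copied to every online CPU."""

    def __init__(self, online_cpus):
        self.mode = INHERIT
        self.stash = PlzaAssoc()
        self.msr = {cpu: self.stash for cpu in online_cpus}

    def _program_all(self, value):
        # Remember the value for CPUs that come online later, then apply it.
        self.stash = value
        for cpu in self.msr:
            self.msr[cpu] = value

    def set_mode(self, mode):
        """Model a write to info/kernel_mode; starts with the default group."""
        if mode not in (INHERIT, CTRL_ONLY, CTRL_AND_MON):
            raise ValueError(mode)
        self.mode = mode
        if mode == INHERIT:
            self._program_all(PlzaAssoc())
        else:
            self._program_all(PlzaAssoc(closid_en=1, plza_en=1,
                                        rmid_en=int(mode == CTRL_AND_MON)))

    def assign(self, closid, rmid=0):
        """Model a write to info/kernel_mode_assignment."""
        if self.mode == INHERIT:
            raise FileNotFoundError("kernel_mode_assignment not visible")
        if self.mode == CTRL_ONLY:
            rmid = 0  # the monitor group is not used in this mode
        self._program_all(replace(self.stash, closid=closid, rmid=rmid))

    def cpu_online(self, cpu):
        """A CPU coming online picks up the stashed setting."""
        self.msr[cpu] = self.stash
```

In this model every CPU always holds the same value, which matches my
understanding of the hardware requirement, and the assignment file is simply
not available while the mode is "inherit_ctrl_and_mon".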
Reinette