linux-kernel.vger.kernel.org archive mirror
From: Reinette Chatre <reinette.chatre@intel.com>
To: Peter Newman <peternewman@google.com>,
	"Luck, Tony" <tony.luck@intel.com>
Cc: "Moger, Babu" <bmoger@amd.com>,
	"babu.moger@amd.com" <babu.moger@amd.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"dave.martin@arm.com" <dave.martin@arm.com>,
	"fenghuay@nvidia.com" <fenghuay@nvidia.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"paulmck@kernel.org" <paulmck@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"thuth@redhat.com" <thuth@redhat.com>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"daniel.sneddon@linux.intel.com" <daniel.sneddon@linux.intel.com>,
	"jpoimboe@kernel.org" <jpoimboe@kernel.org>,
	"alexandre.chartre@oracle.com" <alexandre.chartre@oracle.com>,
	"pawan.kumar.gupta@linux.intel.com"
	<pawan.kumar.gupta@linux.intel.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"perry.yuan@amd.com" <perry.yuan@amd.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"Huang, Kai" <kai.huang@intel.com>,
	"Li, Xiaoyao" <xiaoyao.li@intel.com>,
	"kan.liang@linux.intel.com" <kan.liang@linux.intel.com>,
	"Li, Xin3" <xin3.li@intel.com>,
	"ebiggers@google.com" <ebiggers@google.com>,
	"xin@zytor.com" <xin@zytor.com>,
	"Mehta, Sohil" <sohil.mehta@intel.com>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	"mario.limonciello@amd.com" <mario.limonciello@amd.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>,
	"Eranian, Stephane" <eranian@google.com>,
	"Xiaojian.Du@amd.com" <Xiaojian.Du@amd.com>,
	"gautham.shenoy@amd.com" <gautham.shenoy@amd.com>
Subject: Re: [PATCH v13 00/27] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Date: Thu, 22 May 2025 09:32:42 -0700
Message-ID: <fa78c5e6-582c-43fd-a0c0-5b6a4439b0e2@intel.com>
In-Reply-To: <CALPaoCjh_NXQLtNBqei=7a6Jsr17fEnPO+kqMaNq4xNu2UPDJA@mail.gmail.com>

Hi Peter,

On 5/22/25 1:47 AM, Peter Newman wrote:
> Hi Tony, Reinette,
> 
> On Thu, May 22, 2025 at 2:21 AM Luck, Tony <tony.luck@intel.com> wrote:
>>
>>>>>> There's also the mongroup-RMID overcommit use case I described
>>>>>> above[1]. On Intel we can safely assume that there are counters to
>>>>>> back all RMIDs, so num_mbm_cntrs would be calculated directly from
>>>>>> num_rmids.
>>>>>
>>>>> This is about the:
>>>>>    There's now more interest in Google for allowing explicit control of
>>>>>    where RMIDs are assigned on Intel platforms. Even though the number of
>>>>>    RMIDs implemented by hardware tends to be roughly the number of
>>>>>    containers they want to support, they often still need to create
>>>>>    containers when all RMIDs have already been allocated, which is not
>>>>>    currently allowed. Once the container has been created and starts
>>>>>    running, it's no longer possible to move its threads into a monitoring
>>>>>    group whenever RMIDs should become available again, so it's important
>>>>>    for resctrl to maintain an accurate task list for a container even
>>>>>    when RMIDs are not available.
>>>>>
>>>>> I see a monitor group as a collection of tasks that need to be monitored together.
>>>>> The "task list" is the group of tasks that share a monitoring ID that
>>>>> is required to be a valid ID since when any of the tasks are scheduled that ID is
>>>>> written to the hardware. I intentionally tried to not use RMID since I believe
>>>>> this is required for all archs.
>>>>> I thus do not understand how a task can start running when it does not have
>>>>> a valid monitoring ID. The idea of "deferred assignment" is not clear to me,
>>>>> there can never be "unmonitored tasks", no? I think I am missing something here.
> 
> You are correct. I did forget to mention something...
> 
>>>>
>>>> In the AMD/RMID implementation this might be achieved with something
>>>> extra in the task structure to denote whether a task is in a monitored
>>>> group or not. E.g. We add "task->rmid_valid" as well as "task->rmid".
>>>> Tasks in an unmonitored group retain their "task->rmid" (that's what
>>>> identifies them as a member of a group) but have task->rmid_valid set
>>>> to false.  Context switch code would be updated to load "0" into the
>>>> IA32_PQR_ASSOC.RMID field for tasks without a valid RMID. So they
>>>> would still be monitored, but activity would be bundled with all
>>>> tasks in the default resctrl group.
>>>>
>>>> Presumably something analogous could be done for ARM/MPAM.
>>>>
>>>
>>> I do not interpret this as an unmonitored task but instead a task that
>>> belongs to the default resource group. Specifically, any data accumulated by
>>> such a task is attributed to the default resource group. Having tasks
>>> in a separate group but their monitoring data accumulating in/contributed to
>>> the default resource group (that has its own set of tasks) sounds wrong to me.
>>> Such an implementation makes any monitoring data of default resource group
>>> invalid, and by extension impossible to use default resource group to manage
>>> an allocation for a group of monitor groups if user space needs insight
>>> in monitoring data across all these monitor groups. User space will need to
>>> interact with resctrl differently and individually query monitor groups instead
>>> of CTRL_MON group once.
>>
>> Maybe assign one of the limited supply of RMIDs for these "unmonitored"
>> tasks. Populate a resctrl group named "unmonitored" that lists all the
>> unmonitored tasks in a (read-only) "tasks" file. And supply all the counts
>> for these tasks in normal looking "mon_data" directory.
> 
> I needed to switch to an rdtgroup struct pointer rather than hardware
> IDs in the task structure to indicate group membership[1], otherwise
> it's not possible to determine which tasks are in a group when it
> doesn't have a unique HW ID value.

Whether the task struct contains a pointer (albeit accompanied by its
own complexities) does not address the issue that I am concerned about.

Looking at [1] I expect this new feature handles "unmonitored" groups by
placing them in the default monitoring group, following Tony's first
suggestion [3].

When considering [1] by itself in the context of current resctrl, all tasks
should be members of resource groups that have valid HW monitoring IDs allocated.
Using the default resource group in this way seems to address edge cases
where the pointer is not yet valid (it is unclear what these scenarios may be)
rather than routing many tasks to the default group. I am not sure and I'll have
to study that change more closely to reason accurately.

From what I understand, the new proposal that builds on [1] involves creating
new monitor groups that are "unmonitored" for any length of time. When backed
by the implementation in [1], this would mean these groups will actually
still be monitored, but with the data attributed to the default resource group.
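
To make the concern concrete, here is a minimal user-space sketch of the
behavior as I read it from Tony's first suggestion. It is not kernel code and
not taken from [1]: struct task_model, effective_rmid(), account_traffic() and
model_counts[] are hypothetical names for illustration only, and the only
assumptions carried over from the thread are the "rmid"/"rmid_valid" pair and
loading 0 into IA32_PQR_ASSOC.RMID for tasks without a valid RMID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* RMID of the default resource group; Tony's mail loads "0" for invalid RMIDs. */
#define DEFAULT_GROUP_RMID 0u

/* Hypothetical number of RMIDs in this model, not a hardware value. */
#define MODEL_NUM_RMIDS 8u

/* Hypothetical stand-in for per-task monitoring state; not the real task_struct. */
struct task_model {
	uint32_t rmid;       /* identifies the group the task is a member of */
	bool     rmid_valid; /* false while that group is "unmonitored" */
};

/* Hypothetical per-RMID event accumulator standing in for the hardware counters. */
static uint64_t model_counts[MODEL_NUM_RMIDS];

/* RMID the context switch code would load into IA32_PQR_ASSOC.RMID for this task. */
static uint32_t effective_rmid(const struct task_model *t)
{
	assert(t != NULL);
	return t->rmid_valid ? t->rmid : DEFAULT_GROUP_RMID;
}

/* Attribute traffic to whichever RMID is active while the task runs. */
static void account_traffic(const struct task_model *t, uint64_t bytes)
{
	uint32_t rmid = effective_rmid(t);

	assert(rmid < MODEL_NUM_RMIDS);
	model_counts[rmid] += bytes;
}
```

In this model a task that remains a member of its own group (rmid 5, say) but
has rmid_valid cleared contributes to model_counts[0], so its data becomes
indistinguishable from that of tasks that really are in the default group.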

As I mentioned in response [4] to Tony this fundamentally changes the
behavior users can expect from the default resource group. In addition,
this breaks the first of the "Resource monitoring rules" from
Documentation/filesystems/resctrl.rst:

1) If a task is a member of a MON group, or non-default CTRL_MON group
   then RDT events for the task will be reported in that group.

How does this fit with the ABMC work? I continue to think that I am missing
parts of the discussion, as it seems this new feature discussion is mixed in
with the ABMC work.

Reinette

> 
> Also this is required for shared assignment so that changing a group's
> IDs in a domain only requires updating running tasks rather than
> needing to search the entire task list, which would lead to the same
> problem we encountered in mongroup rename[2].
> 
> -Peter
> 
> [1] https://lore.kernel.org/lkml/20240325172707.73966-5-peternewman@google.com/
> [2] https://lore.kernel.org/lkml/CALPaoCh0SbG1+VbbgcxjubE7Cc2Pb6QqhG3NH6X=WwsNfqNjtA@mail.gmail.com/
[3] https://lore.kernel.org/lkml/aC5lL_qY00vd8qp4@agluck-desk3/
[4] https://lore.kernel.org/lkml/a131e8ed-88b2-4fed-983b-5deea955a9a5@intel.com/

